Informatica 41 (2017) 507–516 507
Landmarking-Based Unsupervised Clustering of Human Faces Manifesting
Labio-Schisis Dysmorphisms
Daniele Conti
Department of Applied Science and Technology, Politecnico di Torino
corso Duca degli Abruzzi 24, 10129 Torino, Italy
Luca Bonacina
Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano
Piazza Leonardo da Vinci 32, 20133 Milano, Italy
Antonio Froio
Department of Energy, Politecnico di Torino, corso Duca degli Abruzzi 24, 10129 Torino, Italy
Federica Marcolin and Enrico Vezzetti
Department of Management and Production Engineering, Politecnico di Torino
corso Duca degli Abruzzi 24, 10129 Torino, Italy
E-mail: federica.marcolin@polito.it
Domenico Speranza
Dipartimento di Ingegneria Civile e Meccanica, Universit degli Studi di Cassino e del Lazio Meridionale
Viale dell’Universit, 03043 Cassino (Frosinone), Italy
Keywords: facial dysmorphism, labio-schisis, diagnosis, feature extraction, landmarking, clustering, D-MST, artificial
intelligence, decision support
Received: July 20, 2016
Ultrasound scans, Computed Axial Tomography, Magnetic Resonance Imaging are only few examples
of medical imaging tools boosting physicians in diagnosing a wide range of pathologies. Anyway, no
standard methodology has been defined yet to extensively exploit them and current diagnoses procedures
are still carried out mainly relying on physician’s experience. Although the human contribution is always
fundamental, it is self-evident that an automatic procedure for image analysis would allow a more rapid
and effective identification of dysmorphisms. Moving toward this purpose, in this work we address the
problem of feature extraction devoted to the detection of specific diseases involving facial dysmorphisms.
In particular, a bounded Depth Minimum Steiner Trees (D-MST) clustering algorithm is presented for
discriminating groups of individuals relying on the manifestation/absence of the labio-schisis pathology,
commonly called cleft lip. The analysis of three-dimensional facial surfaces via Differential Geometry is
adopted to extract landmarks. The extracted geometrical information is furthermore elaborated to feed the
unsupervised clustering algorithm and produce the classification. The clustering returns the probability of
being affected by the pathology, allowing physicians to focus their attention on risky individuals for further
analysis.
Povzetek: Predstavljena je D-MST metoda za nenadzorovano grupiranje slik obrazov za diagnosticiranje.
1 Introduction
Medical imaging has seen an important enhancement in the
past decades thanks to various technological achievements.
Magnetic Resonance Imaging (MRI), Computed Axial To-
mography (CAT), X-ray imaging, Ultrasound scans ima-
ging (US) provide physicians with valuable information to
diagnostic purpose. In particular, foetal diseases attracted
attentions and efforts with the common aim to improve the
current diagnosis techniques, fostered by the objective of
defining a tailored therapy as early as possible. A cru-
cial role in this activity is played by three-dimensional
ultrasound scans, which could provide in-depth detailed
images of foetal morphology in a safe and non-invasive
way. Despite technological improvements, medical image-
driven diagnosis suffers the deficiency/absence of automa-
ted computer science treatment, even for diseases such as
Fetal Alcohol Syndrome (FAS) and labio-schisis [1–5].
This work aims to provide a methodology and a tool for
supporting the diagnosis of labio-schisis pathology (cleft
lip), which has been chosen due to its relatively large in-
cidence in the population [6]. This task is conceived for
prenatal diagnosis and stems from a recently developed
work [7], in which an automatic procedure was designed
to process a stack of 2D ultrasound scans of foetal faces
508 Informatica 41 (2017) 507–516 D. Conti et al.
by transforming the standard DICOM images in a PLY 3D
model. The core of the proposed method relies on the clus-
tering technique. Common algorithms for unsupervised
data clustering belong to two main categories: partitioning
algorithms and hierarchical clusterings [8]. Algorithms in
the first class, such as K-means or the recently proposed
Affinity Propagation (AP) [9], define a subset of indivi-
duals called centroids, i.e. the class exemplars, to which
any other node is compared. Hierarchical clustering algo-
rithms, such as Single Linkage, compare couples of indivi-
duals and merge the closest in a class, thus creating a chain
of hierarchical dependencies. In the first case, the expected
number of classes, i.e. of centroids, should be a priori
defined (except for Affinity Propagation), while in the se-
cond case the pruning of the hierarchical tree specifies how
many clusters to be returned [10]. Among them, the boun-
ded Depth Minimum Steiner Trees (D-MST) unsupervised
clustering algorithm is chosen for this study [11], [12].
For privacy reasons, after the feasibility test on an ideal
foetuses dataset, the public Bosphorus database was adop-
ted, containing facial depth maps of 105 adult individuals
showing the seven fundamental facial expressions, [13].
The defect was artificially simulated on the faces by mo-
difying some Bosphorus point clouds. This way, seven
artificial faces were generated with left-sided and right-
sided labio-schisis. The algorithm is designed to be robust
against different defect types.
The work is structured as follows. Firstly, an outline of
geometrical face description formalization is presented to-
gether with related feature extraction aimed at landmarks
localization. Then, information coming from geometrical
descriptors are exploited to feed the unsupervised D-MST
clustering algorithm for discriminating individuals accor-
ding to the presence/absence of the pathology.
2 Methods
The algorithm is meant to detect the presence/absence of
cleft lip in a query face. It is designed to work with three-
dimensional foetal faces obtained through automatic ela-
boration of bidimensional ultrasound scan stacks. On the
other hand, it has been extensively tested with a large size
adult individuals dataset.
2.1 Mathematical background
Bosphorus database provides coordinates of facial point
clouds, obtained through laser scans, as a binary file. A
pre-built routine is provided together with data for reading
binary files, extract cloud points data, and return informa-
tion as a matrix containing the Cartesian coordinate of each
point. The facial surface can be seen from the mathematical
standpoint as the locus defined as
z ∈ R3 : z = f(u, v)
and it can be referred to as a free-form surface. A free-
form surface is required to be smooth, with normal vec-
tor defined almost everywhere but edges, cusps, etc., but
not belonging to a simple mathematical class of surfaces,
like conics for example. Anyway, it can be divided in sub-
domains, each of them treatable as a linear combination of
simple geometries. Thus, we define a surface patch divided
in domains as an n-tuple of functions:
f(u, v) = (f1(u, v), f2(u, v), ..., fn(u, v)). (1)
Taking advantage of this definition, in order to objecti-
vely compare one face to another, the surface is point-by-
point mapped-on with entities belonging to the Differen-
tial Geometry domain, here called geometrical descriptors.
Twelve different geometrical descriptors, together with first
and second derivatives, are chosen: three coefficients of the
first fundamental form, i.e. E,G, F , three coefficients of
the second fundamental form, i.e. e, f, g, the Gaussian cur-
vature K, the mean curvature H , the principal curvatures
k1 and k2, the shape index S and the curvedness index C.
In the following section we go through the adopted geome-
trical descriptors definitions [14, 20].
2.2 Geometrical descriptors
A free-form surface is not an Euclidean geometry. Thus,
distances on a face cannot be computed with the standard
formula s2 =
∑d
i=0(ui − vi)2. The first fundamental
form, also called Riemann metric, allows to define equi-
valent concept of distance upon a non-Euclidean surface.
For d = 2, the infinitesimal distance element ds can be de-
fined as ds2 = Edu2 + 2Fdudv + Gdv2. E, F , G are
the first fundamental form coefficients. They can also be
expressed in terms of partial derivatives as
E = ‖fu‖2, (2)
F = 〈fu, fv〉, (3)
G = ‖fv‖2, (4)
where fu = ∂f∂u . Moreover, by defining the normal unit
vector in point (u, v) belonging to the face domain as
N(u, v) =
fu × fv
|fu × fv|
(u, v), (5)
we can also introduce the second fundamental form as
ds2 = edu2 + 2fdudv + gdv2, with
e = 〈N, fuu〉, (6)
f = 〈N, fuv〉, (7)
g = 〈N, fvv〉, (8)
where 〈· 〉 denotes the scalar product. In order to intro-
duce curvatures, let us consider the tangent plane Tp(f)
to f in point p = f(u0, v0); it can be defined as the
two-dimensional vector subspace Df(u, v) ⊂ R3, where
D is the functional differential operator. For each point
Landmarking-Based Unsupervised Clustering of. . . Informatica 41 (2017) 507–516 509
p, there exists a set of orthonormal vectors {e1, e2} for
the tangent plane Tp, such that DNp(e1) = −k1e1 and
DNp(e2) = −k2e2, where k1 and k2 are called the princi-
pal curvatures and e1 and e2 the principals directions at p.
In terms of the principal curvatures, Gaussian curvature K
and mean curvature H can be introduced:
K = k1k2 =
eg − f2
EG− F 2 , (9)
H =
k1 + k2
2
=
eG− 2fF + gE
2(EG− F 2) . (10)
Thus, the principal curvatures can be obtained as the
roots of the quadratic equation k2 − 2Hk + K = 0, re-
sulting in
k1 = H +
√
H2 −K (11)
and
k2 = H −
√
H2 −K. (12)
An insightful method for evaluating curvatures was in-
troduced by Koenderink and van Doorn [15], who defined
the shape index S and the curvedness index C. They can
be expressed in terms of the principal curvatures:
S = − 2
π
arctan
k1 + k2
k1 − k2
, with S ∈ [−1,+1], k1 > k2,
(13)
C =
√
k21 + k
2
2
2
. (14)
The range spanned by the shape index can be partitioned
into nine different intervals, spanning from cup to dome,
each of them representing a particular shape. The cur-
vedness index provides information about how gently the
surface bends. Differently from other geometrical descrip-
tors such as the shape index, it is not independent on the
unit length and has the dimension of a reciprocal length.
These geometrical descriptors are computed for each point
of the face and exploited for both landmarking and cluste-
ring phases.
2.3 Landmarking
Geometrical descriptors are suitable to be mapped point-
by-point on facial surfaces. So, by computing their values
for all individuals in the dataset, a distribution of their local
behaviour is obtained. Such a statistics can be exploited
as characteristic information of the facial region and used
to automatically localize facial landmarks. Landmarks are
typical facial points, such as the nose tip, i.e. the pronasal,
the nose basis, i.e. the subnasal, the internal and external
eye extrema, i.e. the endocanthions and exocanthions. Fi-
gure 1 shows the most renown landmarks.
Landmarks can be automatically detected by setting
tailored thresholds, empirically defined, in specific facial
Figure 1: Main soft-tissue landmarks. From top to bottom:
IEsx/IEdx, left and right Inner Eyebrows. N, Nasion.
ENsx/ENdx, right and left Endochantions. ALAsx/ALArx,
Alae. PN, Pronasal. SN, Subnasal. LS, Labrum superior.
CHsx/CHdx, righty and left Cheilions. LI, Labrum Infe-
rior.
areas (where each landmark is more likely to be) for all ge-
ometrical descriptor facial maps. Focusing on the specific
problem addressed in this work, we developed a program
for automatically extracting pronasal and labrum superior
points. For further details about landmarking, please refer
to [16–23].
2.4 Clustering
In order to perform the clustering, the input database is put
in the form of a N ×M matrix with a row for each indi-
vidual to be classified and as many columns as the number
of geometrical descriptors to be exploited for facial des-
cription purpose. For instance, considering all individuals
available, if we report values of all seventeen geometrical
descriptors expressed in the labrum superior landmark, a
105 x 17 matrix is created. In our case, values of a subset
of geometrical descriptors expressed in an arbitrary surface
portion, covering the central part of the upper lip and depar-
ting on a straight line from the identified LS landmark, are
considered, in order to catch sufficient information about
the possible presence of labio-schisis pathology. Cluste-
ring algorithms require the definition of a dissimilarity me-
asure to be used for one-to-one face comparisons. It leads
to the so called Dissimilarity Matrix, S, whose entries si,j
with i, j = 1, ..., N are the dissimilarities between any cou-
ple of individuals i and j. si,j can be defined in different
ways, depending on the kind of data to be clustered [24].
Spearman’s correlation rank distance ρi,j is chosen as dis-
similarity measure, in order to overcome problems related
to descriptors, such as the shape and curvedness indexes,
lying on different domains and with different measure sca-
les. In particular:
510 Informatica 41 (2017) 507–516 D. Conti et al.
ρi,j =
M∑
m=1
(xi,m − x̄i) · (xj,m − x̄j)√∑
m (xi,m − x̄i)2 ·
∑
m (xj,m − x̄j)2
,
(15)
si,j = 1− ρi,j . (16)
In 15, the usual definition of correlation is given, being
ρ the Spearman’s rank correlation coefficient. In particular,
the variable x is the data rank, instead of the bare data it-
self as for Pearson correlation coefficient, i and j identify
individuals and index m runs over geometrical descriptors
values. In the second equation, s is the dissimilarity. The
input dataset is treated as a fully connected weighted graph
G(n, e), with n = 1, ..., N , e = {(ni, nj)}, i, j = 1, ..., N ,
with N individuals and N(N−1)2 weighted edges e, whose
weight is the dissimilarity si,j between the two individu-
als. A fictitious node, called root, is added to this graph
and connected to all other nodes by a weight λ, empirically
defined. From a Physical viewpoint, λ can be interpreted
as the chemical potential of the system, i.e. the cost for ad-
ding an individual to the system itself, and it governs the
most probable number of clusters returned. On the other
hand, a depth parameter D is introduced to drive the final
output. It is a constraint representing the maximum obser-
vable depth, namely the distance, in terms of nodes, from
the root to the external leaves of clusters. Thanks to this
additional parameter, D-MST interpolates between the Af-
finity Propagation algorithm, returning an arbitrary number
of spherical clusters with D = 2, in which leaves are di-
rectly connected to the centroids via edge means with com-
parable dissimilarities, and the Single Linkage algorithm,
in which D > N . In order to perform a classification, two
variables (di, πi) are assigned to any node and exploited to
define an objective function to be minimized for detecting
the optimal spanning tree T ∗ in the graph. The variable
di ∈ [2, N ], di ≤ D is the distance, in terms of number of
nodes, from the root, and assumes discrete values; variable
πi = j, j ∈ [1, N ], j 6= i is a pointer tracking the ancestor
of node i. Thus, the cost function is:
E({di, πi}Ni=1) =∑
i
si,πi+
+
∑
i,j∈∂i
(hi,j(πi, πj , di, dj) + hj,i(πj , πi, dj , di)),
(17)
where hij is defined as
hi,j =
{
0 {πi = j ⇒ di = dj + 1}
−∞ else, (18)
and imposes an artificial constraint to the cost function that
requires the returned optimum tree to be connected. In this
terms, the probability of observing a configuration of vari-
able for the optimum tree is given by the Boltzmann weight
P ({πi, di}) ∝ e−βE({πi,di}) (19)
and it is maximized by a message passing algorithm des-
cribed in [25] and [26].
3 Results
3.1 Preliminary analysis
Individuals’ faces are reported in Bosphorus database as
point clouds pre-ordered on a square grid and with the same
orientation, in particular with nose oriented alongside with
z-axis, x-axis aligned with chin-forehead line and, conse-
quently, y-axis aligned from cheek to cheek. A first ana-
lysis of data contained in Bosphorus database is conducted
by inspection, in order to examine the facial points suita-
ble to our purpose. Indeed, referring only to faces with no
expression, some facial point clouds showed degradation
and low accuracy in shaping the face itself. In particular,
it is not unusual to encounter data with a rough mouth sur-
face that has no actual correspondence to the individual’s
picture accompanying data. Thus, all corrupted data were
excluded from further analysis, resulting in an input matrix
collecting 74 healthy individuals.
As a preliminary step, all facial point clouds are cropped
in size, limiting the region of interest to a squared area. A
four-pixels-side mean-filter is then applied in order to re-
duce noise and smooth surface peaks, where a pixel is in-
tended as the squared surface area wrapping a point of the
face. Moreover, the fact of having pre-oriented faces allo-
wed us to avoid a pre-processing step aimed to provide data
with a standard orientation. Most of all, it allowed to easily
identify the central region of the face, by looking at those
points with relatively higher z-coordinate. This way, the fa-
cial area containing nose and a mouth portion is identified
and exploited for further analysis.
3.2 Computing geometrical descriptors and
landmarking
Geometrical Descriptors are point-by-point computed, star-
ting from derivatives. They are obtained by computing the
surface gradients, along x and y directions, then by aver-
aging values obtained in a ten-pixels-side window centred
into the point of interest. All other geometrical descriptors
can be obtained, as previously shown in Methods section,
starting from first and second derivatives, and are easily
computed for each point of the facial surface.
Once geometrical descriptors are obtained, they are ex-
ploited to define where the pronasal landmark and the la-
brum superior landmark are placed upon the surface. Our
attention is focused most of all on the latter landmark, as
it would affect the chosen area of investigation for the cleft
Landmarking-Based Unsupervised Clustering of. . . Informatica 41 (2017) 507–516 511
Figure 2: Geometrical descriptors mapped on a face. From
top to bottom: bare facial surface, shape index S, second
fundamental form f and derivative along x.
lip dysmorphism clustering. Indeed, the pronasal is help-
ful in proceeding with an accurate identification of the la-
brum superior itself. Starting from the central area of the
face identified previously relying on z-coordinate, we set
empirical constraints to values assumed by meaningful ge-
ometrical descriptors. They are the descriptors that, in the
region of interest, present a characteristic behaviour. Re-
ferring to the previous work [19] and assessing the choice
against this database, the shape index S, the second fun-
damental form coefficient f , and first derivatives along x
and y directions have been chosen (figure 2). This subset
of descriptors, conveniently constrained, leads to a 100%
accuracy in the automatic determination of the pronasal
landmark for the pre-processed database in exam.
Once the pronasal is identified, it is adopted to delimit
the region of interest for the labrum superior detection.
Approximatively half of the area going from chin to the
pronasal itself is taken into consideration to detect this se-
cond landmark. In this region, the previous procedure is
repeated changing only the geometrical descriptor adopted.
In such a case, the chosen information relies upon the shape
index S, the mean curvature H , the first derivative along x,
and the second derivative along x. In this case, the accuracy
reached by the algorithm is around 94%, but even when the
landmark obtained by the algorithm and the ground truth
landmark do not match perfectly, their relative distance re-
mains around a few pixels. Thus, the error is not affecting
the final output of the algorithm.
3.3 Prenatal applicability
In its original intention, the present work has been designed
for pre-birth diagnosis of rare diseases manifesting facial
dysmorphisms. Labio-schisis presents high incidence and
it is clearly detectable through ultrasound-scans when the
foetus is affected. Another connected pathology is palato-
schisis. It is more difficult to be observed by US-scans
inspection and its current diagnosis techniques would be-
nefit of an automatic procedure for highlighting morpholo-
gical differences that are symptomatic of the disease itself.
In this perspective, the clustering procedure here proposed
could be efficient only if it relies on an effective detection
of the foetus’s facial landmarks, which are clearly fuzzier
than those of an adult. In our work, we tested the landmar-
king algorithm on the limited amount of real foetus data
available and found that, after a tiny rearrangement of con-
straints imposed to the same subset of geometrical descrip-
tors, the pronasal and the labrum superior landmarks were
successfully detected (figure 3). Although this result is not
statistically significant, it moves towards the application of
such kind of procedure to pre-birth diagnosis. Indeed, once
landmarks are identified correctly, the cleft lip manifesting
face morphology reports similar differential geometry pro-
perties and, thinking toward a clustering perspective, it is
totally equivalent to the case of adult individuals.
512 Informatica 41 (2017) 507–516 D. Conti et al.
Figure 3: Surface Landmarking. For each surface, the
white dot indicates the position of the labrum superior
landmark. The comparison between faces not manifesting
(top) and manifesting (centre) the labio-schisis is presen-
ted. Moreover, it is shown the case of landmarking on a
foetal 3D model, (bottom). 0-valued points in the last fi-
gure indicate missing data.
3.4 Unsupervised clustering
Once the labrum superior landmark has been detected, the
area of interest for the labio-schisis pathology is identified.
In particular, a straight line laying on the upper lip is ta-
ken as a biometric information of the individual. The line
length and width can be arbitrary, provided that it spans
most of the lip itself, without invading other face regions.
Therefore, if the cleft lip is present in the individual under
investigation, it will also be in the region in exam. More-
over, the pre-orientation of individuals’ face simplifies the
identification of the lip line, avoiding a useless detection of
its complete morphology.
A subset of meaningful geometrical descriptors expres-
sed in any point composing this line is then stored into a
row vector, building a matrix collecting all individuals in
exam. The opposite in sign of the shape index S and of
the coefficient of the first fundamental formG, the first and
second derivatives along y of the free-form surface are suf-
ficient to obtain the convergence of the clustering algorithm
toward a successful classification. In our specific study, the
chosen line is forty pixels-long and its width is three pixels,
pinched at the labrum superior landmark. This choice is
useful for the application of a median-filter on any three-
pixels-side square of geometrical values spanning the line.
Median filter smooths the descriptor behaviour along the
surface, without adding artificial information to the one ex-
tracted from the geometrical analysis of the surface. In the
end, the input matrixM presents a number of rows equal to
74 (number of healthy faces) plus 7 (number of artificially-
induced cleft lip-affected faces) and a number of columns
equal to 40 times the number of chosen geometrical des-
criptors.
Starting from M , the so called similarity matrix is com-
puted. As previously mentioned, the similarity matrix S is
a squared symmetric matrix which reports, with any of its
entries sij , a measure of distance between any couple of
individuals. In particular, the most suitable choice for the
kind of data handled in this work is the Spearman’s corre-
lation rank distance, computed as sij = 1− ρij , where ρ is
shown explicitly in equation 15. Once the similarity matrix
is computed and the specific clustering depth D is set, the
unsupervised clustering D-MST can be run.
As specified previously in the section Methods, D-MST
is governed by an external parameter λ influencing the
number of identified clusters. Lower values of λwould lead
toward many clusters and non-linked individuals, while lar-
ger values of the parameter would return a single cluster
collecting all nodes of the graph. In general, the maximum
value of dissimilarities sij found in the S matrix is taken
as upper bound for the parameter. In order to identify the
most proper value of lambda to be chosen, stability regi-
ons of the clustering algorithm are investigated. They are
intended as regions of the parameter space in which the
algorithm converges toward a stable solution, in terms of
number of clusters, despite the parameter change. So, the
range between the minimum value of dissimilarity, exclu-
Landmarking-Based Unsupervised Clustering of. . . Informatica 41 (2017) 507–516 513
Figure 4: Number of clusters versus λ, for D = 2. Pla-
teaux curve. The higher the λ value, the lower the num-
ber of clusters. For the plateaux associated to 5, 4 and 2
clusters, the relative clustering are overlaid. In particular,
observing plateau a), two cleft lip clusters are detected cor-
rectly, even separating left- and right-sided cleft lips; on the
contrary, healthy individuals population is divided in sub-
populations as well. Increasing λ, blue clusters do not form
a single class, while individuals affected by the pathology
are merged to healthy clusters.
ded zero, and the maximum one, is linearly split up into a
large number of bins. For each of them, the algorithm is
run fifty times, in order to return results with a statistical
significance. In such a way, what here is called plateaux
curve is built and it plots the number of clusters returned as
a function of the value of λ.
Such a procedure is repeated for the different depths in-
vestigated. Starting from depth D = 2, meaning a sin-
gle link between centroid and leaves, the plateaux curve is
obtained running the algorithm for values of lambda inclu-
ded into the range λ ∈ [0.1, 0.8] and spaced 0.01. Figure
4 shows the passage from nearly no clusters, for λ close
to 0, to a single cluster for λ large. The blue line shows
the mode of number of clusters, while the red one is the
average number of clusters with its standard deviation. An
example of clustering obtained for lambdas falling in the
specific plateaux range is shown; blue bullets represents
healthy individuals, while red ones are individuals affected
by the pathology.
The 5-clusters plauteaux, letter a) in figure 4, presents
three healthy individuals clusters and two cleft lip clusters.
Observing labels following bullets (not shown in the
figure), one can analyze deeper the structure of the two
red clusters and it is possible to appreciate how they are
divided according to the presence of right- and left-sided
cleft lip (see also figure 7). Proceeding to larger values
of lambda, the range highlighted with letter b) indicates
a region in which two of the healthy individuals clusters
merge together and then merge again with both the cleft-lip
clusters (letter c)). From this analysis it turns out how
2-MST unsupervised clustering is not able to unveil
the inner structure of the proposed data and, trying to
impose a spherical geometry, it returns more than one
sub-population for the healthy individual class.
The same procedure is repeated moving to depth D = 3.
Again the parameter λ is spanned from the minimum to
the maximum of the dissimilarities, looking for the largest
stability region of the algorithm. With higher clustering
depths, it is found how the decay toward a single cluster
plateaux is faster with respect to the spherical clustering.
Thus, in order to build a reliable plateaux curve, a finer
grating for lambda values is required, especially in the tran-
sition region. Moreover, at this depth it is quite common to
encounter outliers, i.e. single nodes that are not assigned to
any class. To this purpose, a green line is plotted as well,
showing the mode of number of clusters composed by at
least two nodes, in order to unveil the number of outliers
present in the clustering.
Figure 5 shows the plateaux curve for depth D = 3. In
such a case, two plateaux with no outliers are identified in
the transition region between none and one cluster. Here
lambda spans with a 0.002 step in order to track in detail
the plateaux behaviour, while in the other regions of the
curve a 0.005 step is kept. The most important parameter
region, i.e. that with three clusters and indicated with letter
b) in figure 5, discriminates well healthy individuals from
those manifesting cleft lip. In particular, left- and right-
sided cleft lips are clustered in two separated classes, while
healthy individuals create a unique cluster. Then, the algo-
rithm succeeds in identifying the investigated structure of
the proposed data.
In order to understand the robustness of the clustering
against the stochasticity of the algorithm, we compute the
probability for any node of being assigned to its own class.
Setting the value of lambda inside the plateaux, λ = 0.18,
the algorithm is run fifty times, building a statistics of the
Figure 5: Number of clusters versus λ, forD = 3. Plateaux
curve. The green solid line represents the obtained plateaux
curve of the resulting clusters with at least 2 nodes. A sin-
gle plateau, i.e. plateau b), is stable in converging toward a
solution with no outliers and reports the discrimination of
individuals with a pathology (see figure 7).
514 Informatica 41 (2017) 507–516 D. Conti et al.
Figure 6: Assignment Probability Plot. The probability of a
node to be assigned to its most frequent cluster throughout
50 runs of the clustering algorithm is plotted. No value
below 1 is found.
Figure 7: Resulting Clusters For D = 3. The 4-nodes clus-
ter includes all the right-sided cleft lip affected individuals;
an example of such individual surface is reported at the top
of the figure. On the other hand, the 3-nodes cluster inclu-
des left-sided cleft lip affected individuals. An example of
individual surface belonging to this class is reported at the
bottom of the figure.
class assignment for any node. Plotting the node assign-
ment probability, as shown in figure 6, it can be observed
how all individuals show a probability of being member of
their most frequent class equal to 1. Moreover, comparing
each clustering returned in subsequent runs with the first
one obtained, we can compute the overlap among cluste-
rings, finding that the algorithm converges always to the
same classifications, in terms of composition of the single
clustering. In conclusion, such a clustering overlap and as-
signment probability allow us to state with high confidence
the robustness of the classification returned.
Higher depths trees are also investigated, but results are
not reported here, as they do not improve sensibly those
obtained for D = 3. This means that, once the spherical
constraint is relaxed, data do not show a longer range
dependency and a three nodes correlation is able to catch
the inner structure of clusters.
4 Conclusions
This work proposes an innovative automatic methodology
for the diagnosis of cleft lip. The process is designed for
the detection of diseases in the prenatal phase, thus its fe-
asibility is tested upon the limitedly available ultrasound
scans data and then deeply investigated for a large dataset
of adult individuals. Bosphorus database is chosen for the
testing, which seven cleft lip-affected individuals are added
to, by artificially imposing the defect.
The algorithm maps each face with twelve differential
geometry descriptors plus first and second derivatives with
respect to x and y directions. It allows to determine fa-
cial landmarks, that enable face comparison. In this work,
only the pronasal and the labrum superior landmarks are
investigated. They are identified automatically by imposing
thresholds on values expressed by a subset of geometrical
descriptors. In particular, the labrum superior specifies the
region of interest for the actual diagnosis.
In the second part of the algorithm, the geometrical des-
criptors expressed in the labrum superior’s neighbourhood
are used to transform each face into a vector and create
the input matrix for the unsupervised clustering algorithm.
Any entrance of such built vector is used to perform the
comparison between couples of individuals and compute
their dissimilarity in terms of Spearman’s correlation rank
distance. In such a way, a squared symmetric matrix is
computed and provided as input for the clustering itself.
Eventually, D-MST clustering algorithm allows to inves-
tigate regions of convergence stability to a certain number
of clusters, the so called plateaux, imposing the maximum
depth, i.e. the inner dependencies structure, of any cluster
detected. It correctly separates left-sided and right-sided
cleft lips, thus showing accurate diagnosis results.
This algorithm also opens the route for the definition of
what is called normotype. The normotype can be consi-
dered as the representative face of a class of individuals,
Landmarking-Based Unsupervised Clustering of. . . Informatica 41 (2017) 507–516 515
collecting all the principal features distinguishing an in-
dividual as member of a particular category. The present
algorithm is able to collect all healthy individuals in a sin-
gle cluster, starting from features expressed in the lip re-
gion, in comparison to those manifesting a cleft lip. This
allows a formalization of the normotype features. On the
other hand, once the cleft lip population will account for a
sufficient number of members, the labio-schisis normotype
would be defined as well.
Other syndromes, like the Fetal Alchol Syndrome (FAS)
or the palato-schisis syndrome, will also be investigated
and geometrically formalized to be embedded in this algo-
rithm, which could be a multi-syndrome diagnosing tool.
References
[1] T. S. Douglas, T. E. M. Mutsvangwa, ”A Review
of Facial Image Analysis for Delineation of the Fa-
cial Phenotype Associated With Fetal Alcohol Syn-
drome,” Am J Med Genet Part A, vol. 152, no. 2, pp.
528536, 2010.
[2] S. Campbell, C. Lees, G. Moscoso, P. Hall, ”Ultra-
sound antenatal diagnosis of cleft palate by a new
technique: the 3D reverse face view,” Ultrasound Ob-
stet Gynecol, vol. 25, no. 1, pp. 1218, 2005.
[3] I. Aras, S. Olmez, S. Dogan, ”Comparative Evalua-
tion of Nasopharyngeal Airways of Unilateral Cleft
Lip and Palate Patients Using Three-Dimensional
and Two-Dimensional Methods,” The Cleft Palate-
Craniofacial Journal, vol. 49, no. 6, pp. e75e81,
2012.
[4] L. D. Platt, G. R. DeVore, D. H. Pretorius, ”Impro-
ving Cleft Palate/Cleft Lip Antenatal Diagnosis by
3-Dimensional Sonography,” J Ultrasound Med, vol.
25, no. 11, pp. 14231430, 2006.
[5] G. Tonni, M. Lituania, ”OmniView Algorithm, A
Novel 3-Dimensional Sonographic Technique in the
Study of the Fetal Hard and Soft Palates,” J Ultra-
sound Med, vol. 31, no. 2, pp. 313318, 2012.
[6] P. A. Mossey, J. Little, R. G. Munger, M. J. Dixon, W.
C. Shaw, ”Cleft lip and palate,” The Lancet, vol. 374,
no. 9703, pp. 1773-1785, 2009.
[7] L. Bonacina, D. Conti, A. Froio, F. Marcolin, E. Vez-
zetti, ”Automatic 3D Fetus Face Model Extraction
From Ultrasonography Through Histogram Proces-
sing”, submitted to IEEE Trans Biomed Eng.
[8] A. K. Jain, M. N. Murty, P. J. Flinn, ”Data Clustering:
A Review”, ACM Computing Surveys, vol. 31, no. 3,
1999.
[9] B. J. Frey, D. Dueck, ”Clustering by Passing Messa-
ges Between Data Points,” Science, vol. 315, no. 972,
2007.
[10] J.H. Jr. Ward, ”Hierarchical grouping to optimize an
objective function,” Journal of the American Statisti-
cal Association, vol. 58, no. 301, pp. 236-244, 1963.
[11] M. Bailly-Bechet, S. Bradde, A. Braunstein, A. Flax-
man, L. Foini, R. Zecchina, ”Clustering with shal-
lows trees,” Journal of Statistical Mechanics: The-
ory and Experiments, vol. 2009, doi:10.1088/1742-
5468/2009/12/P12010, 2009.
[12] M. Bailly-Bechet, C. Borgs, A. Braunstein, J. Chayes,
A. Dagkessamanskaia, J.-M. Franois, and R. Zec-
china. ”Finding undetected protein associations in
cell signaling by belief propagation,” Proceedings of
the National Academy of Sciences. 2010; 108(2):882-
7, doi:10.1073/pnas.1004751108
[13] The Bosphorus Database Official Web Site:
http://bosphorus.ee.boun.edu.tr/default.aspx
[14] E. Vezzetti, F. Marcolin, ”Geometrical descriptors for
human face morphological analysis and recognition,”
Robotics and Autonomous Systems, vol. 60, no. 6, pp.
928-939, 2012.
[15] J. J. Koenderick, A. J. van Doorn, ”Surface shape and
curvature scales”, Image and Vision Computing, vol.
10, no. 8, pp. 557-564, 1992.
[16] E. Vezzetti, D. Speranza, F. Marcolin, G. Fracastoro,
G. Buscicchio, ”Exploiting 3D Ultrasound for Fetal
Diagnostic Purpose Through Facial Landmarking,”
Image Anal Stereol, vol. 33, no. 3, pp. 167-188, 2014.
[17] E. Vezzetti, D. Speranza, F. Marcolin, G. Fracastoro,
”Diagnosing cleft lip pathology in 3d ultrasound: A
landmarking-based approach,” Image Anal Stereol,
vol. 35, no. 1, pp. 53-65, 2015.
[18] S. Moos, F. Marcolin, S. Tornincasa, E. Vezzetti, M.
G. Violante, G. Fracastoro, D. Speranza, F. Padula,
”Cleft lip pathology diagnosis and foetal landmark
extraction via 3D geometrical analysis,” Internatio-
nal Journal on Interactive Design and Manufacturing
(IJIDeM), vol. 11, no. 1, pp. 1-18, 2017.
[19] E. Vezzetti, F. Marcolin, ”Geometry-based 3D face
morphology analysis: soft-tissue landmark formali-
zation,” Multimed Tools Appl, vol. 68, no. 3, pp. 895-
929, 2012.
[20] E. Vezzetti, F. Marcolin, ”3D human face description:
landmarks measures and geometrical features,” Image
and Vision Computing, vol. 30, no. 10, pp. 698-712,
2012.
[21] E. Vezzetti, S. Moos, F. Marcolin, V. Stola, ”A pose-
independent method for 3D face landmark formaliza-
tion,” Computer Methods and Programs in Biomedi-
cine, vol. 198, no. 3, pp. 1078-1096, 2012.
516 Informatica 41 (2017) 507–516 D. Conti et al.
[22] E. Vezzetti, F. Marcolin, V. Stola, ”3D Human Face
Soft Tissues Landmarking Method: An Advanced
Approach,” Computers in Industry, vol. 64, no. 9, pp.
1326-1354, 2013.
[23] E. Vezzetti, F. Marcolin, ”3D Landmarking in Mul-
tiexpression Face Analysis: A Preliminary Study on
Eyebrows and Mouth,” Aesthetic Plastic Surgery, vol.
38, pp. 796-811, 2014.
[24] R. Xu, C. Donald, Clustering, Wiley-IEEE Press, No-
vember 2008.
[25] M. Bayati, A. Braunstein, R. Zecchina, ”A rigorous
analysis of the cavity equations for the minimum
spanning tree,” Journal of mathematical physics, vol.
49, no. 12, 2008.
[26] M. Bayati et al., ”Statistical mechanics of Steiner
trees,” Physical Review Letters, vol 101, no. 3, 2008.