https://doi.org/10.31449/inf.v47i9.4995 Informatica 47 (2023) 1–10 1 
Detecting Breast Cancer in X-RAY Images Using Image 
Segmentation Algorithm and Neural Networks 
 
Rasha Talib Gdeeb  
Department of Environmental Engineering, College of Engineering, University of Baghdad, Iraq 
E-mail: rasha.talib@coeng.uobaghdad.edu.iq 
Keywords: gaussian mixture, breast cancer, mammogram rays, median filter, neural networks 
Received: June 30, 2023
Breast cancer presents a global health challenge that endangers the lives of women all over the world. Because of this, many researchers are attempting to provide an early detection technique to lessen the danger that breast cancer may cause, with a potential impact of saving 30% of the afflicted populace. Mammography, employing X-ray irradiation, serves as a quintessential modality for the identification of breast anomalies including neoplastic obstructions, discomfort, and nipple exudations. Concurrently, deep learning, a subset of artificial intelligence, has garnered momentum in the realm of breast carcinoma diagnostics. This paradigm facilitates the automated detection and categorization of neoplastic formations within mammographic imagery, as well as other radiological techniques, by learning to discern patterns autonomously without explicit algorithmic instructions. This technology is being used to detect breast cancer earlier and more accurately than before. With the help of deep learning, radiologists can identify suspicious lesions, classify them as benign or malignant, and even predict the risk of recurrence of a malignant tumor. Furthermore, it enables the visualization of tumors that might elude unaided ocular inspection. Several types of neural network architectures, including but not limited to convolutional and artificial neural networks, have been deployed in various studies for neoplasm detection. This task needs a preprocessing stage based on image processing, such as filtering, image enhancement, and gray-level detection, to isolate and detect even the smallest areas in X-RAY images. This research uses an image processing and computer vision approach to detect and recognize tumor areas in an X-RAY image, with the aid of neural networks to classify the danger level of the disease automatically.
Povzetek (Summary): To ensure early detection of breast tumors, the research introduces techniques for automatic selection of treatment based on X-ray images. The three-step method includes primary aid, chemical treatment, and removal.
 
1 Introduction 
Breast cancer has become one of the most dangerous threats to women all over the world. Early detection of the cancer increases the chance of recovery and can save 30 percent of affected women, which is a large percentage. The danger of breast cancer comes from the fact that women often do not know about it until they have a mammogram of the breast; by self-examination it can be detected only in late stages. It is therefore important to have periodic medical examinations to check for cancerous lumps in the breast tissue or underarm, which can indicate the existence of a tumor [1]. A mammogram is an X-RAY applied to the breast that can be used to find problems such as tumor masses, pain, and secretions from the nipples. Mammograms can detect breast cancer early and decrease the number of deaths. Mammogram imaging starts at the age of 40 and must be repeated every 3 years to confirm the absence of the disease. In cases with a family history of the disease, it is important to start mammogram imaging before the age of 40, since detecting a tumor early increases the chance of recovery [2].
 
A mammogram can help detect cancer up to 3 years before any changes or symptoms appear in the breast. Doctors usually ask for mammogram imaging when these symptoms appear [3]-[6]:
• Secretions from the nipples, with or without breast milk.
• A change in the nipple shape or inversion of the nipple.
• Changes in the breast skin.
• A tumor or swelling in the breast (all of it or some parts).
• A tumor on the lymph nodes.
 
There are two types of X-RAY imaging in use:

• Screening imaging: a periodic check for breast cancer in a woman who has no signs of the disease. It is a routine check after the age of 40 for early detection.
• Diagnostic imaging: a check performed to diagnose a specific problem in the breast, for example when a lump is felt.

When a woman has a mammogram, the result appears after 24 hours with a special code that describes the situation:
• BIRADS0: the check is incomplete; it needs to be repeated or followed by further checks to confirm the result.
• BIRADS1: the result is negative; there are no signs of cancer.
• BIRADS2: the situation is normal and there is no cancer, although calcifications or a benign lump may be present in the breast.
• BIRADS3: a benign lump is present; it is important to repeat the check after 6 months to make sure that the tumor has not changed to another type.
• BIRADS4: a high probability of cancer; a biopsy (a sample of cells) must be taken by the doctor.
• BIRADS5: a very high probability, up to 90%, of a cancerous tumor; the tissues must be checked too.
 
2 Objective and expected research 
contribution 
This research aims to create an effective user interface that doctors can use to determine the condition of the breast cancer and the medical treatment required. The input of the system is an X-RAY image converted to an RGB image so that it can be handled in MATLAB. Several techniques were used to solve the problems that the image may suffer from. The system can be used to reduce wrong medical analyses for a wide range of cancers, such as breast cancer, brain cancer, and skin cancer [7]-[9].
The research objectives are summarized in the following points:
• Build an effective user interface to detect breast cancer using MATLAB.
• Choose the best filtering methods to remove noise from the images.
• Build a good feature extractor that gives the best and most accurate results.
 
3 Research methods and materials 
To decide the condition of the cancer, we must check its size and whether it has spread to the lymph nodes or any other parts of the body. The stages of breast cancer are:

• Stage 0: known as ductal carcinoma in situ, which starts in the milk ducts and does not spread to nearby tissues.
• Stage 1: the tumor is about 2 centimeters in size and has not yet affected the lymph nodes.
• Stage 2: the tumor is still about 2 centimeters and has started to spread to the nearby lymph nodes.
• Stage 3: the tumor grows to 5 centimeters in size and starts to spread to some lymph glands.
• Stage 4: the tumor has spread to distant parts of the body such as the bones, liver, brain, and lungs.

The available treatments are:

• Surgery: there are several types of surgery, depending on the medical diagnosis. The first is excision of the tumor with some nearby tissue, especially if the tumor is small. The second is mastectomy, which means removing the lobes, milk ducts, adipose tissue, nipple, areola, some skin, and possibly some muscle from the breast. Another type is removal of lymph nodes; removing an affected node may stop the tumor, because once the tumor affects the lymph nodes it can spread all over the body [10].
• Radiotherapy: about one month after surgery, it is important to receive specific doses of radiotherapy to kill any remaining cancer cells. A woman needs three to five sessions a week for three to six weeks.
• Chemotherapy: if there is a high risk of the cancer recurring or spreading, the doctor may prescribe cytotoxic drugs. This is called adjuvant chemotherapy. If the tumor is large, the doctor may resort to chemotherapy before surgery to reduce the size of the tumor and ease its removal.
• Hormonal therapy: doctors resort to hormonal therapy to prevent recurrence of hormone-sensitive breast cancers. This type of treatment is usually used after surgery, but it may be used before it to help shrink the tumor and make it easier to remove.
 
4 Related works  
Detecting breast cancer is not easy because it involves several important steps, the first of which is finding the tumor position and size using classical image processing or supervised training. To increase system accuracy, some studies combined image processing with convolutional neural networks (CNNs). For example, Setio et al. [7] extracted features from different views of a 3D chest CT scan and diagnosed the medical condition using a CNN. Ross et al. [8] converted the 3D image into 2D images, each split into patches, called a 2.5D view; this 2.5D view was fed into a CNN to detect early cancer features, which increased the accuracy of the system. The study in [9] uses transfer learning, because techniques that are faster than CNNs, such as the support vector machine (SVM), suffer in accuracy. In medical imaging a CNN needs initial training with dedicated supervision from doctors and experts, and transfer learning reduces this dependence. Transfer learning is, however, limited in its ability to distinguish between medical images and human organs. In study [10] the researcher used the fractal dimension (FD), which can distinguish malignant from benign lumps by their geometry, so fractal geometry suits this setting. The researcher used FA and SVM together with the Box Count Method (BCM), which gave the best results in its respective sector; BCM was used to extract the features. The FD features were obtained from 42 images in a dataset and processed with the SVM classifier to distinguish malignant from benign cases. The accuracy of this system was 98.13%. Studies [11],[12] focused on finding small and tiny tumors to predict breast cancer at an early stage, which can reduce death rates; deaths of women from breast cancer are reported between 59 and 69. The challenge is that tumor lumps are hard to detect because of changes in tissue density in mammography pictures. That research addressed breast cancer (BC) detection and early diagnosis of cancer using mammographic images taken by 3D MRI, with classification done using machine learning.
 
Table 1: An overview of previous studies on early breast cancer detection methods.
 
| Method | Dataset | Images | Purpose | Accuracy |
|---|---|---|---|---|
| DeCAF model | BreakHis for cancer detection | 7909 | Deep features for breast cancer histopathological image categorization | 80-85% |
| SVM, PNN, MLP | 6 different datasets | 7273 | Detection and diagnosis system | 99.7% |
| CNN | Needle biopsy microscopy images | 500 | Computer-aided diagnosis of breast cancer | 96-100% |
| Deep learning | Pre-trained convolutional neural network | 927 | Diagnose breast cancer in MRI images | 95% |
| CNN-based model | BreakHis, breast cancer classification challenge 2015 dataset | 7909 + 43707 | Classify H&E stained breast biopsy images | 77.8-83.3% |
| CSDCNN-based approach | BreakHis dataset | 7909 | Image and patient level classification | 93.2% |
 
As shown in Table 1, the accuracy of the methods used falls short of what early cancer detection needs. This research mainly aims to use preprocessing algorithms to increase the accuracy of neural networks in breast cancer detection, which can help decrease the danger of cancer for women.
Pre-processing the data and combining classifiers with neural networks helps the networks focus on the most important features rather than looking at the whole gray-scale image. Using classifiers also helps detect smaller areas that can change the decision about the stage found in an X-RAY image.
5 Breast cancer detection system 
structure 
Our system passes through multiple stages:
• Reading and resizing images: the images are of various types and are processed in MATLAB as matrices, so they must all have the same size to be multiplied with masks or passed to a neural network or classifier.
• Thresholding the resized image: because we do not want the complex textures, the image is converted to a binary image to separate the cancer parts from the background.
• Using wavelets to extract the edges with the maximum sharpness, which define the cancer position and the important area around it.
• Creating a geometric matrix for the biometric calculations that define the different values of the cancer.
• Loading the training set and training the neural network.
After all the features are obtained, they are linked in the database with the suitable therapy. The first aim of feature extraction is to speed up processing: we do not need to compare the input image with all stored images, which may consume time and memory. Instead we extract features, choose the most important points in the image, and compare them with the saved data, which increases the speed of the system.
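The first two stages above can be sketched as follows. This is a minimal illustration in Python with NumPy rather than the MATLAB code used in the research; the function names `resize_nearest` and `to_binary`, the nearest-neighbor resampling, and the fixed threshold of 128 are all assumptions made for the example.

```python
import numpy as np

def resize_nearest(img, out_h, out_w):
    """Resize a 2D image by nearest-neighbor sampling (assumed method)."""
    rows = np.arange(out_h) * img.shape[0] // out_h
    cols = np.arange(out_w) * img.shape[1] // out_w
    return img[rows[:, None], cols]

def to_binary(img, threshold=128):
    """Convert a gray image to a binary mask: 1 where pixel >= threshold."""
    return (img >= threshold).astype(np.uint8)

# Usage on a tiny synthetic gray image (values 0, 16, ..., 240).
img = np.arange(16, dtype=np.uint8).reshape(4, 4) * 16
small = resize_nearest(img, 2, 2)
mask = to_binary(small)
```

Bringing every image to one size first means that the later masks and the feature vector have the same shape for all inputs.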
Figure (1) shows the main steps used in this research for 
breast cancer detection: 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
[Figure 1 flowchart: Read images → Threshold (binary image) → Wavelet transform → Extract features → Split features dataset into train and test sets → Train neural network]
              Figure 1: Steps of proposed method
The main metric values we will calculate are:
• Contrast: a measure of the ratio between the tones of an image. This ratio differentiates image landmarks such as textures, highlights, shadows, colors, and clarity in a photograph.
• Correlation: a measure of how strongly the gray level of a pixel is related to that of its neighbor over the whole image.
• Energy: a measure used especially to capture changes in the image at a local position, that is, a change in one or more image properties such as color, brightness, or pixel magnitude in a local area. For a gray-level co-occurrence matrix it is the sum of the squared elements. It responds well to edges, which are the hardest regions to compress and which correspond to a gradient of color across the image.
• Homogeneity: a statistical measure of how uniform the image is, that is, how far the most frequent gray levels gather into populations of identical values.
• Mean: the average of a set of values, widely used in mathematics and statistics. It can be computed in several ways depending on the task: as an arithmetic, geometric, or harmonic mean.
• Standard deviation: an important measure of the amount of variation in a set of values, also used when computing similarity. Low values mean the data lie close to the mean (the expected value of the set); high values mean the data are spread over a wide range. Mathematically, the standard deviation of a sample, a statistical population, or a dataset is the square root of its variance.
• Entropy: a measure from information theory, originally defined for data transmitted across a noisy channel, that quantifies the information in a random variable using probability values. It is widely used in deep learning and machine learning for feature selection, building decision trees, and fitting classification models.
• RMS: the root mean square, that is, the square root of the mean of the squared values.
• Variance: a statistical indicator of how far the numbers in a dataset are spread out; it measures how far each point is from the mean of the set it belongs to. Outside image analysis, analysts and traders also use it to gauge volatility and market security.
• Smoothness: this measure estimates how smooth the data are. Smoothing relies on weighted averages of the observations and decreases the effect of random positive and negative variations by letting them partially offset each other.
• Kurtosis: a statistical measure that describes the tails of the distribution of data points in a dataset. With low kurtosis, the tails are less extreme than those of the normal distribution. With high kurtosis, occasional extreme values (either positive or negative) are expected.
• Skewness: a measure of how asymmetric the distribution is. The distribution is symmetric when its positive and negative sides mirror each other, and the skewness is zero when the distribution is balanced around its center.
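The metrics listed above can be sketched in code. The example below is an illustration in Python with NumPy, not the MATLAB implementation used in the research. It assumes an 8-bit gray image, a co-occurrence matrix built for the horizontal neighbor only and quantized to 8 levels, and the common definition of smoothness as 1 - 1/(1 + variance); the function names are hypothetical.

```python
import numpy as np

def glcm_horizontal(img, levels=8):
    """Symmetric, normalized gray-level co-occurrence matrix, offset (0, 1)."""
    q = (img.astype(np.float64) * levels / 256.0).astype(np.int64)  # quantize
    left, right = q[:, :-1].ravel(), q[:, 1:].ravel()
    glcm = np.zeros((levels, levels), dtype=np.float64)
    np.add.at(glcm, (left, right), 1.0)   # count neighboring pairs
    glcm = glcm + glcm.T                  # make the matrix symmetric
    return glcm / glcm.sum()

def texture_features(img, levels=8):
    """Return the twelve metrics of the list above for an 8-bit gray image."""
    p = glcm_horizontal(img, levels)
    i, j = np.indices(p.shape)
    mu_i, mu_j = (i * p).sum(), (j * p).sum()
    sd_i = np.sqrt((((i - mu_i) ** 2) * p).sum())
    sd_j = np.sqrt((((j - mu_j) ** 2) * p).sum())
    denom = sd_i * sd_j

    x = img.astype(np.float64).ravel()
    mean, var = x.mean(), x.var()
    std = np.sqrt(var)
    z = (x - mean) / std if std > 0 else np.zeros_like(x)
    hist = np.bincount(img.ravel().astype(np.int64), minlength=256) / x.size
    nz = hist[hist > 0]

    return {
        "contrast": float((((i - j) ** 2) * p).sum()),
        "correlation": float(((i - mu_i) * (j - mu_j) * p).sum() / denom) if denom > 0 else 0.0,
        "energy": float((p ** 2).sum()),
        "homogeneity": float((p / (1.0 + np.abs(i - j))).sum()),
        "mean": float(mean),
        "std": float(std),
        "entropy": float(-(nz * np.log2(nz)).sum()),
        "rms": float(np.sqrt((x ** 2).mean())),
        "variance": float(var),
        "smoothness": float(1.0 - 1.0 / (1.0 + var)),
        "kurtosis": float((z ** 4).mean()),
        "skewness": float((z ** 3).mean()),
    }

# Usage: a 4x4 image whose left half is black and right half is white.
demo = np.array([[0, 0, 255, 255]] * 4, dtype=np.uint8)
features = texture_features(demo)
```

Collecting the values in one dictionary makes it easy to turn them into the fixed-length feature vector that is later fed to the neural network.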
The first and main step of this algorithm is image segmentation, which means extracting specific gray levels from the image. The task begins by filtering the image to remove noise and extracting its edges. A Gaussian mixture model (GMM) is then applied to isolate the affected areas more accurately, dividing the image into two regions separated by edges. The result is passed to the wavelet transform, which defines edges more accurately, especially for small areas. Combining GMM with the wavelet transform improves the detection of small edges, which are important for detecting cancer. Detecting very small edges helps determine the stage of the disease, which in turn can lead to a different treatment. Figure 2 shows the steps of image segmentation.
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
[Figure 2 flowchart: Threshold image → Perform adaptive filter → GMM (Gaussian Mixture Model) → Wavelet transform → Store extracted edge values]
            Figure 2: Image segmentation task

5.1 The median filter  
Impulse noise is one of the most common types of noise found in medical images, due to the frequencies at which medical imaging works. This noise is also known as salt and pepper noise, and it results from unstable voltage values. It forces pixels to fixed values, either white with a value of 255 or black with a value of 0 [13]. The noise model can be written as in Eq. (1).
 
$$x_i = \begin{cases} 0 & \text{with probability } p_n \\ 255 & \text{with probability } p_p \\ \varphi_i & \text{with probability } 1 - (p_n + p_p) \end{cases} \tag{1}$$
  
  
where $x_i$ is the distorted pixel in the image, $\varphi_i$ is the original value of the pixel, and $p_n$, $p_p$ are the probabilities that a pixel is affected by pepper or salt noise; each is half of the noise density, which lies between 0 and 1. A filtering algorithm may be linear or non-linear. A linear filter is applied to every pixel regardless of whether it is noisy, so it is not selective and alters noisy and clean pixels alike. A non-linear filter, on the other hand, works in two stages: first it detects the noisy pixels, and then it filters only those and keeps the values of the clean pixels. One of these filters is the median filter, which takes the median of the neighboring pixels and replaces the noisy pixel with that value. It can remove salt and pepper noise while keeping the edges intact.
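The noise model of Eq. (1) and a standard 3x3 median filter can be sketched as below. This is an illustrative Python/NumPy example, not the code used in the research; the function names, the 3x3 window, and the edge-replication padding are assumptions.

```python
import numpy as np

def add_salt_pepper(img, p_n, p_p, rng):
    """Eq. (1): a pixel becomes 0 with probability p_n, 255 with probability p_p."""
    u = rng.random(img.shape)
    noisy = img.copy()
    noisy[u < p_n] = 0                              # pepper
    noisy[(u >= p_n) & (u < p_n + p_p)] = 255       # salt
    return noisy

def median_filter3(img):
    """Standard 3x3 median filter; borders are handled by edge replication."""
    padded = np.pad(img, 1, mode="edge")
    h, w = img.shape
    # Stack the nine shifted copies of the image and take the per-pixel median.
    windows = np.stack(
        [padded[r:r + h, c:c + w] for r in range(3) for c in range(3)], axis=0
    )
    return np.median(windows, axis=0).astype(img.dtype)

# Usage: corrupt a flat gray image with 10% noise and filter it.
flat = np.full((32, 32), 100, dtype=np.uint8)
noisy = add_salt_pepper(flat, 0.05, 0.05, np.random.default_rng(0))
denoised = median_filter3(noisy)
```

Because the median of a window is always one of the values already in that window, an isolated impulse is discarded rather than averaged into its neighbors.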
There are several variants of this filter, such as the standard median filter (SM), the center weighted median filter (CWMF), the tri-state median filter (TSMF), the progressive switching median filter (PSMF), and the adaptive progressive switching median filter (APSMF) [15]-[17].
To evaluate the performance of these filters, we calculate the peak signal-to-noise ratio (PSNR) as in Eq. (2) and Eq. (3):
 
$$PSNR = 10 \log_{10} \frac{255^2}{MSE} \tag{2}$$
where:
$$MSE = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} \left[ I(i,j) - \bar{I}(i,j) \right]^2 \tag{3}$$
 
where $M$ and $N$ give the size of the image, $I$ denotes the pixels of the original image, and $\bar{I}$ the pixels of the filtered image.
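Eq. (2) and Eq. (3) translate directly into code. The sketch below is in Python with NumPy; the function names are hypothetical, and returning infinity for identical images is a convention chosen for the example.

```python
import numpy as np

def mse(original, filtered):
    """Eq. (3): mean squared error between two images of the same size."""
    diff = original.astype(np.float64) - filtered.astype(np.float64)
    return float(np.mean(diff ** 2))

def psnr(original, filtered, peak=255.0):
    """Eq. (2): peak signal-to-noise ratio in decibels for 8-bit images."""
    err = mse(original, filtered)
    if err == 0.0:
        return float("inf")   # identical images: no noise at all
    return float(10.0 * np.log10(peak ** 2 / err))

# Usage: two images that differ by one gray level at every pixel.
a = np.zeros((4, 4), dtype=np.uint8)
b = np.ones((4, 4), dtype=np.uint8)
score = psnr(a, b)
```

A higher PSNR means the filtered image is closer to the original, so the filter variants can be ranked by this value.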
 
5.2 Adaptive Median Filter  
This type of filter can also detect and remove noise. Its window is adaptive: the window size is increased while certain conditions are not met, and if the conditions are met the filtering is done with the normal window size [18].
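The source does not spell out the conditions, so the sketch below assumes the classic textbook formulation of the adaptive median filter: the window grows until its median lies strictly between its minimum and maximum, and the pixel is replaced only if it looks like an impulse itself. It is written in Python with NumPy; the function name and the maximum window size of 7 are assumptions.

```python
import numpy as np

def adaptive_median_filter(img, max_window=7):
    """Adaptive median filter (assumed textbook variant) for a 2D gray image."""
    pad = max_window // 2
    padded = np.pad(img, pad, mode="edge")
    out = img.copy()
    h, w = img.shape
    for r in range(h):
        for c in range(w):
            for size in range(3, max_window + 1, 2):
                k = size // 2
                win = padded[r + pad - k:r + pad + k + 1,
                             c + pad - k:c + pad + k + 1]
                z_min, z_med, z_max = win.min(), np.median(win), win.max()
                if z_min < z_med < z_max:
                    # The median is not an impulse; replace the pixel only
                    # if the pixel itself is an extreme value of the window.
                    if not (z_min < img[r, c] < z_max):
                        out[r, c] = z_med
                    break
            else:
                # Window limit reached without a usable median: fall back
                # to the median of the largest window.
                out[r, c] = z_med
    return out

# Usage: a horizontal gray ramp with one salt and one pepper pixel.
clean = np.tile(np.arange(10, 170, 10, dtype=np.uint8), (16, 1))
noisy = clean.copy()
noisy[8, 8] = 255
noisy[4, 4] = 0
restored = adaptive_median_filter(noisy)
```

Unlike the standard median filter, pixels that are not extreme values in their window are left untouched, which helps preserve fine detail.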
 
 
 
5.3 Gaussian mixture model  
This classifier is a predictive classifier that handles more than one feature for each data point. The algorithm relies on a clustering procedure similar to that of K-means. The mixture is defined in Eq. (4) and Eq. (5):

$$f(x) = (1 - \pi)\, g_1(x) + \pi\, g_2(x) \tag{4}$$

where each component is Gaussian:
$$g_j(x) = \varphi_{\theta_j}(x), \qquad \theta_j = (\mu_j, \sigma_j^2) \tag{5}$$
as illustrated in Figure 3.
 
 
Figure 3: Gaussian mixture class decision 
making. 
 
The left plots show the densities of two Gaussian functions $g_1(x)$ and $g_2(x)$ in blue and orange, together with a data point (green) at $x = 0.5$ whose class is to be decided. The right plots show the relative densities, called the responsibilities, which are defined in Eq. (6) and Eq. (7):

$$\frac{g_1(x)}{g_1(x) + g_2(x)} \tag{6}$$
$$\frac{g_2(x)}{g_1(x) + g_2(x)} \tag{7}$$
 
The responsibilities express how strongly each cluster accounts for the reference point. In the top plot the standard deviation is 1, and in the bottom plot it is 0.2. The EM algorithm uses the responsibilities to make a soft assignment of each point to the two clusters. When the standard deviation is high, the responsibilities are close to 0.5; as the standard deviation approaches zero, the responsibility approaches one for the cluster whose center is closest to the point and zero for the other clusters [19].
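The behavior of the responsibilities can be checked numerically. The sketch below is in Python with NumPy and evaluates Eq. (6) and Eq. (7), weighted by the mixing proportion of Eq. (4), at the point x = 0.5 for the two standard deviations mentioned in the text. The cluster means 0.0 and 1.2 are hypothetical values chosen for the illustration; they are not taken from Figure 3.

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std sigma."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def responsibilities(x, mu1, mu2, sigma, pi=0.5):
    """Responsibilities of the two components for the point x.

    With pi = 0.5 the mixing weights cancel and this reduces to
    Eq. (6) and Eq. (7).
    """
    w1 = (1.0 - pi) * gaussian_pdf(x, mu1, sigma)
    w2 = pi * gaussian_pdf(x, mu2, sigma)
    total = w1 + w2
    return float(w1 / total), float(w2 / total)

# Hypothetical cluster means; the point x = 0.5 is slightly closer to mu1.
r1_wide, r2_wide = responsibilities(0.5, 0.0, 1.2, sigma=1.0)
r1_narrow, r2_narrow = responsibilities(0.5, 0.0, 1.2, sigma=0.2)
```

With the wide components the assignment stays soft (both values near 0.5), while with the narrow components almost all of the responsibility goes to the nearer cluster.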
5.4 Wavelets transform  
When we look at an image, we find that it consists of connected areas with similar structure, in which groups of gray levels combine to form objects. If these objects are small or have low contrast, we need to examine them at high resolution; in the same way, if they are large or have high contrast, a coarse view is enough [20],[21].
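As an illustration of this multi-resolution idea, the sketch below computes one level of a 2D Haar wavelet transform in Python with NumPy. The source does not state which wavelet family was used, so the Haar wavelet, the averaging normalization, and the function name are assumptions. The approximation band gives the coarse view, and the detail bands respond to edges.

```python
import numpy as np

def haar_dwt2(img):
    """One level of a 2D Haar transform (averaging form, assumed wavelet).

    Returns (ll, lh, hl, hh): the coarse approximation and the detail bands
    that respond to horizontal, vertical, and diagonal edges respectively.
    """
    x = img.astype(np.float64)
    x = x[: x.shape[0] // 2 * 2, : x.shape[1] // 2 * 2]   # crop to even size
    # Pairwise average and difference of neighboring columns.
    lo = (x[:, 0::2] + x[:, 1::2]) / 2.0
    hi = (x[:, 0::2] - x[:, 1::2]) / 2.0
    # Pairwise average and difference of neighboring rows.
    ll = (lo[0::2] + lo[1::2]) / 2.0
    lh = (lo[0::2] - lo[1::2]) / 2.0
    hl = (hi[0::2] + hi[1::2]) / 2.0
    hh = (hi[0::2] - hi[1::2]) / 2.0
    return ll, lh, hl, hh

# Usage: an image with a single vertical edge before the last column.
edge = np.array([[0, 0, 0, 255]] * 4, dtype=np.uint8)
ll, lh, hl, hh = haar_dwt2(edge)
```

Applying the same function again to `ll` would give the next, coarser level, which is how small and large structures can be examined at different resolutions.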
 
5.5 Artificial neural network 
The artificial neural network (ANN) was created to let computers think and act as the human brain does, so that the network can understand the workflow and make decisions [12-17].
The human brain contains on the order of 100 billion neurons, and each one is connected to nearby neurons. The brain stores data as a distributed system in which each cell holds a part of the data. When the data must be recalled, each cell sends its part to be collected, so the brain works as a parallel processing system. Figure 4 shows the three main layers of a neural network.
 
Figure 4: Layers of a neural network 
 
The three main layers in any ANN are:
Input Layer:
This is the first layer; it accepts all the data entered by the programmer. The data may be of any type and form: binary, float, integer, etc.
Hidden Layer:
It connects the inputs to the outputs. All the calculations of the neural network happen in this layer, where the weights are adjusted so that the network learns the patterns or features of the inputs.
Output Layer:
The last part of the ANN, which produces the output depending on the hidden layer calculations.
Every neural network has a transfer function that calculates the output by multiplying each input by its weight and adding a constant called the bias. Eq. (8) defines this operation.
$$y = \sum_{i=1}^{n} w_i x_i + b \tag{8}$$
where $x_i$ are the inputs, $w_i$ their weights, and $b$ the bias.

The weighted input determines which output node fires, according to its importance in the network. Firing requires an activation function; there are several types, of which the sigmoid function is the best known and most used.
ANNs are widely used because of their advantages and the accuracy of their results. The most important advantages are:
• ANNs can work as a parallel processing system that performs more than one task at a time.
• Data is shared and stored across the whole network, so if one part of the data disappears the network keeps working.
• An ANN can work without prior information about the whole situation.
• An ANN remembers the situations it was trained on, so it must be fed with all the required examples or it may give a wrong output.
ANNs also suffer from some disadvantages:
• There is no global view of the network structure; we can only feed it with inputs and verify the output by experience, trial, and error [18-20].
• An ANN does not explain how or why it produces an output; it just gives a tested solution.
• An ANN needs good computing hardware with parallel processing.
• It is difficult to show the hidden and internal states of an ANN because of the numerical data it deals with.
After the inputs are entered, each input is multiplied by its corresponding weight. These weights may or may not change, depending on the activity of the neuron and its relation with other neurons.
The network needs a bias term to keep the weighted sum from being zero, so it is added to each neuron calculation. The bias can scale up the system response. Usually the bias input is equal to 1, and the inputs of the network can vary from 0 to positive infinity. The output of this stage is then passed to the activation function, as shown in Figure 5.
To control the network and get the desired output we need an activation function. Activation functions may be linear or non-linear. The most used transfer functions are the hyperbolic tangent and sigmoid activation functions.
In the binary function, the output has just two values, 0 or 1, obtained by using a threshold: any value above the threshold gives 1, otherwise the output is 0.
The sigmoid function is one of the most used functions and has the shape of a curved "S". It is defined in Eq. (9):
$$F(x) = \frac{1}{1 + e^{-ax}} \tag{9}$$
where $a$ is a constant that defines how steep the function is.
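The weighted sum, the binary threshold function, and the sigmoid of Eq. (9) can be put together for a single neuron. The sketch below is in Python with NumPy; the function names and the example weights are hypothetical.

```python
import numpy as np

def sigmoid(x, a=1.0):
    """Eq. (9): F(x) = 1 / (1 + exp(-a x)); a controls the steepness."""
    return 1.0 / (1.0 + np.exp(-a * x))

def binary_step(x, threshold=0.0):
    """Binary activation: 1 above the threshold, otherwise 0."""
    return np.where(x > threshold, 1, 0)

def neuron_output(inputs, weights, bias, a=1.0):
    """Weighted sum of the inputs plus the bias, passed through the sigmoid."""
    return sigmoid(np.dot(weights, inputs) + bias, a)

# Usage with hypothetical inputs and weights: 1*0.5 + 2*(-0.25) + 0 = 0.
y = neuron_output(np.array([1.0, 2.0]), np.array([0.5, -0.25]), bias=0.0)
steps = binary_step(np.array([-1.0, 0.5]))
```

A larger value of `a` makes the sigmoid approach the binary step, while a smaller value gives a smoother transition between 0 and 1.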
 
 
Figure 5: Working manner of a neural network 
 
6 ANN types 
There are several types of artificial neural networks, depending on the tasks required of them. These include the perceptron, the ADALINE and MADALINE networks, and the more recent convolutional neural networks and recurrent neural networks. Whatever the type, ANNs fall into two main classes [22-23]:
• Feedback ANN: this type of network uses the computed output to generate an error signal that verifies how the ANN works, so that the network gives the best results. This network was first created by the University of Massachusetts and used to solve optimization problems for atmospheric research.
• Feed-forward ANN: this is the classical neural network, with one input layer, one output layer, and one or more hidden layers. This type of network is well suited to pattern recognition systems and gives highly accurate results.
The tasks of activation functions
The activation function is a function that tells the neuron what the required output is, such as true or false, yes or no. It maps the output of the neural network to a given range, which may be [0, 1] or [-1, +1].
• There are two main types of transfer functions: linear and non-linear activation functions.
 
7 Results and discussion 
First, we read the input image and filter each of its three color layers with the adaptive filter. We convert the image to uint8 so that the gray values lie between 0 and 255.
After that we run the GMM with two regions (affected by cancer or not) and two GMM components. We set the number of GMM iterations to 10, and then we multiply the image to increase its brightness.
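A two-component GMM fitted with 10 EM iterations, as described above, can be sketched as follows. This is an illustration in Python with NumPy, not the MATLAB code used in the research. It models only the pixel intensities, initializes the two means at the darkest and brightest gray levels, and treats the brighter component as the suspicious region; these choices and the function names are assumptions.

```python
import numpy as np

def fit_gmm_1d(values, n_iter=10):
    """Fit a two-component 1D Gaussian mixture to gray levels with EM."""
    x = values.astype(np.float64).ravel()
    mu = np.array([x.min(), x.max()])          # assumed initialization
    var = np.full(2, x.var() + 1e-6)
    weight = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: responsibility of each component for each pixel.
        dens = weight * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) \
            / np.sqrt(2.0 * np.pi * var)
        resp = dens / (dens.sum(axis=1, keepdims=True) + 1e-300)
        # M-step: update weights, means, and variances.
        nk = resp.sum(axis=0) + 1e-12
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-6
        weight = nk / x.size
    return mu, var, weight, resp

def segment(img, n_iter=10):
    """Return a binary mask of the pixels assigned to the brighter component."""
    mu, _, _, resp = fit_gmm_1d(img, n_iter)
    labels = resp.argmax(axis=1).reshape(img.shape)
    return (labels == int(np.argmax(mu))).astype(np.uint8)

# Usage on a synthetic image: dark tissue with one small bright patch.
rng = np.random.default_rng(0)
img = np.clip(50 + rng.normal(0, 5, (32, 32)), 0, 255)
img[10:16, 10:16] = np.clip(220 + rng.normal(0, 5, (6, 6)), 0, 255)
img = img.astype(np.uint8)
mask = segment(img)
```

The synthetic image only stands in for a mammogram; on real data the two intensity populations overlap far more, which is why the filtering step comes first.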
We then extract the features of the image using the “glcm” command.
The network used in this research is a perceptron neural network with 13 inputs, which accept the features extracted from each image, and 3 outputs, which correspond to the three main states: Benign, Malign, and Normal. The network has one hidden layer. The weights are initialized close to 0 to avoid wrong values after training.
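The shape of this network can be sketched as below, in Python with NumPy. The 13 inputs, one hidden layer, and 3 outputs come from the text, and the 20 hidden neurons come from the flowchart of the network. The sigmoid hidden activation, the softmax output, the weight scale of 0.01, and all names are assumptions; the sketch shows only an untrained forward pass, not the training procedure.

```python
import numpy as np

# Class order follows the dataset labels 0, 1, 2.
CLASSES = ("Benign", "Malign", "Normal")

def init_network(rng, n_in=13, n_hidden=20, n_out=3, scale=0.01):
    """Create weights close to 0, as described in the text."""
    return {
        "W1": rng.normal(0.0, scale, (n_hidden, n_in)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, scale, (n_out, n_hidden)),
        "b2": np.zeros(n_out),
    }

def forward(params, features):
    """Forward pass: sigmoid hidden layer, softmax output (both assumed)."""
    hidden = 1.0 / (1.0 + np.exp(-(params["W1"] @ features + params["b1"])))
    logits = params["W2"] @ hidden + params["b2"]
    exp = np.exp(logits - logits.max())      # subtract max for stability
    return exp / exp.sum()

def classify(params, features):
    """Return the name of the class with the highest output."""
    return CLASSES[int(np.argmax(forward(params, features)))]

# Usage with a placeholder feature vector of 13 ones.
params = init_network(np.random.default_rng(0))
probs = forward(params, np.ones(13))
label = classify(params, np.ones(13))
```

Before training, the three outputs are nearly equal because the weights are close to zero; training would adjust `W1`, `b1`, `W2`, and `b2` on the labeled feature vectors.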
Figure 6 shows a flowchart of ANN used in this 
research: 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
Figure 6: Classify using neural network 
 
The model is a new model; no pre-trained models were used for the cancer detection task. Several studies use convolutional neural networks to detect breast cancer, but these systems need more computation power and more training time than image processing combined with ANN methods.
The next step is training the model on the images in the dataset. The dataset consists of 200 images for each of the classes Benign, Malign, and Normal, split into 3 files labeled 0, 1, and 2 respectively. The images are stored in 8-bit gray scale (gray levels between 0 and 255). The dataset was taken from the Kaggle website (http://www.kaggle.com). The extracted features are converted to a vector and fed to a feed-forward neural network, which outputs the class index of the cancer for each image.
Figure 6 (the breast X-RAY image) shows an X-RAY image of a breast. In this image the background is black, the tissue is gray, and the cancer position is white. The image needs to be enhanced with the adaptive median filter to detect the cancer area, as is clear in Figure 7 and Figure 8.
[Figure 6 flowchart (Classify using neural network): Features extracted → Neural network input (13 neurons) → Hidden layer (20 neurons) → Output layer (3 neurons) → Classification]
The proposed system classified our dataset into three classes. The dataset studied consists of 20 images of each class; the system was trained using GMM and classified with k-means and SVM. The system performance was assessed using accuracy, sensitivity, specificity, and other statistical measures defined in Eq. (10)-(15) below.
 
 
 
Figure 6: A breast X-RAY image 
 
 
Figure 7: Using adaptive median filter 
 
 
Figure 8: Detecting cancer position using K-means and 
Gaussian mixture 
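To illustrate the k-means part of this step only, the sketch below clusters gray levels into three groups, matching the black background, gray tissue, and white cancer region described for the X-RAY image. The one-dimensional formulation, the evenly spaced initial centres, the function name, and the sample intensities are assumptions for this example; the Gaussian mixture and SVM stages are not shown.

```python
def kmeans_1d(values, k=3, iterations=20):
    """Cluster gray levels into k groups; return (centres, label per value)."""
    lo, hi = min(values), max(values)
    # Spread the initial centres evenly over the intensity range (needs k >= 2).
    centres = [lo + (hi - lo) * i / (k - 1) for i in range(k)]

    def assign(current):
        # Label each value with the index of its nearest centre.
        return [min(range(k), key=lambda j: abs(v - current[j])) for v in values]

    for _ in range(iterations):
        labels = assign(centres)
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            if members:  # keep the old centre if a cluster is empty
                centres[j] = sum(members) / len(members)
    return centres, assign(centres)


# Made-up pixel intensities: dark background, gray tissue, bright region.
pixels = [0, 5, 10, 120, 130, 125, 250, 255, 245]
centres, labels = kmeans_1d(pixels)
```

With these sample values the brightest cluster is the last one, so the pixels labelled 2 mark the candidate region.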
 
\[ \text{accuracy} = \frac{C_p/G_p + C_N/G_N}{G_p + G_N} \times 100 \tag{10} \]

\[ \text{misclassification rate} = \frac{C_N/G_p + C_p/G_N}{N} \times 100 \tag{11} \]

\[ \text{sensitivity} = \frac{C_p/G_p}{C_p/G_p + C_p/G_N} \times 100 \tag{12} \]

\[ \text{specificity} = \frac{C_N/G_N}{C_N/G_p + C_N/G_N} \times 100 \tag{13} \]

\[ \text{false-positive ratio} = \frac{C_N/G_p}{C_N/G_p + C_N/G_N} \times 100 \tag{14} \]

\[ \text{false-negative ratio} = \frac{C_p/G_N}{C_p/G_p + C_p/G_N} \times 100 \tag{15} \]
 
where $C_p$ and $C_N$ are the counted positives and
negatives, and $G_p$ and $G_N$ are the global positives
and negatives. Table 2 shows the results for the system.
 
Table 2: Statistical measures of dataset
Calculation               Database
Accuracy                  98.56%
Misclassification rate    1.44%
Sensitivity               97.66%
Specificity               98.46%
False-positive ratio      3.016%
False-negative ratio      1.68%
 
A false positive error, or false positive, is a result that
indicates a given condition exists when it does not. For
example, a cancer test which indicates that a woman has
cancer when she does not. As shown in Table 2, this
error is small.
A false negative error, or false negative, is a test result
which wrongly indicates that a condition does not hold.
For example, a cancer test which indicates that a woman
has no cancer when she has. It is an important error that
must be kept in mind. Table 2 shows that this error was
very small, which indicates the high performance of the
model.
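The measures of Eqs. (10)-(15) can be computed from four counts, as in the sketch below. It assumes that the four terms of the equations play the roles of true positives, true negatives, false positives, and false negatives, and it takes the misclassification rate as the share of wrongly classified images; the function name and the sample counts are hypothetical and are not the counts behind Table 2.

```python
def classification_measures(tp, tn, fp, fn):
    """Return the six measures, in percent, from the four confusion counts.

    Assumed correspondence with the notation of Eqs. (10)-(15):
    tp ~ C_p/G_p, tn ~ C_N/G_N, fp ~ C_N/G_p, fn ~ C_p/G_N.
    """
    total = tp + tn + fp + fn
    return {
        "accuracy": 100.0 * (tp + tn) / total,
        "misclassification_rate": 100.0 * (fp + fn) / total,
        "sensitivity": 100.0 * tp / (tp + fn),
        "specificity": 100.0 * tn / (tn + fp),
        "false_positive_ratio": 100.0 * fp / (fp + tn),
        "false_negative_ratio": 100.0 * fn / (tp + fn),
    }


# Made-up counts for illustration only.
measures = classification_measures(tp=45, tn=50, fp=2, fn=3)
for name, value in measures.items():
    print(f"{name}: {value:.2f}%")
```

The function expects non-zero denominators, that is, at least one positive and one negative case in the evaluated set.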
8 Applications and Implications 
The method proposed in this research is an important
way to distinguish between several stages of breast
cancer. This is important for several reasons. The
accurate detection of cancer edges used in this method
gives a more accurate separation between the stages of
cancer, and the treatment decision may then vary
between chemotherapy and surgical removal, which can
make a difference.
In addition, this method is easy to use in any clinic or
hospital, because modern devices can give their output
as a JPG image that can be passed into our application,
which helps to get an early detection of the cancer.
The study's findings indicate that combining expert
judgment with machine decision-making can provide
more accurate outcomes and help clinics and doctors
avoid making mistakes in their judgments.
The main challenge of using this application is that the
dataset may grow larger with time, which requires more
learning time and more computation power. The bigger
the dataset, the higher the accuracy we can obtain. The
other challenge is that some X-RAY images are not in
JPG form but in other forms that cannot be used in our
application.
 
9 Discussions 
Compared with previous studies, the proposed method
gives a higher accuracy for detection and classification
of the disease, showing the importance of merging
classifiers with neural networks to enhance the total
accuracy of a tumor detection system. This is because
the classifiers can help the neural network to focus on
the most significant features in the image produced by
the classification task. The research also focuses on the
importance of the pre-processing stages based on image
processing, like filtering and morphological operations,
which help the system to better isolate the infected
areas. The obtained accuracy of 98.56% may be further
increased by using several types of classifiers, like
SVM or decision trees, combined with more advanced,
up-to-date convolutional neural networks (CNNs).
10 Conclusion and recommendations 
1- We recommend using the proposed system in
hospitals as an automated cancer detection
system because of its high accuracy and speed.
2- We recommend encouraging governmental
establishments to carry out the practical
conversion to automated systems because of the
high death levels caused by errors in medical
analysis.
3- We recommend using the system for early
detection of breast cancer; the system is easy
to use, so women can use it even at home after
getting the X-ray image.
 
References: 
 
[1] S. Nanglia, M. Ahmad, F. Ali Khan, and N. Z. 
Jhanjhi, “An enhanced Predictive heterogeneous 
ensemble model for breast cancer prediction,” 
Biomed. Signal Process. Control, vol. 72, no. 
103279, p. 103279, 2022. 
https://doi.org/10.1016/j.bspc.2021.103279 
 
[2] R. S. Khudeyer and N. M. Almoosawi, 
“Combination of machine learning algorithms and 
Resnet50 for Arabic Handwritten Classification,” 
Informatica, vol. 46, no. 9, pp. 39–44, 2023, doi: 
10.31449/inf.v46i9.4375.  
 
[3] X. Zhang et al., “Extracting comprehensive clinical 
information for breast cancer using deep learning 
methods,” Int. J. Med. Inform., vol. 132, no. 
103985, p. 103985, 
2019.https://doi.org/10.1016/j.ijmedinf.2019.103985 
[4] Francis Effirim Botchey, Zhen Qin, Kwesi Hughes-
Lartey and Ernest Kwame Ampomah. Predicting 
Fraud in Mobile Money Transactions using Machine 
Learning: The Effects of Sampling Techniques on 
the Imbalanced Dataset. Informatica, 45: 45–56, 
2021. https://doi.org/10.31449/inf.v45i7.3179 
[5] A. A. Akinyelu, F. Zaccagna, J. T. Grist, M. 
Castelli, and L. Rundo, “Brain tumor diagnosis 
using machine learning, convolutional Neural 
Networks, capsule Neural Networks and Vision 
Transformers, applied to MRI: A survey,” J. 
Imaging, vol. 8, no. 8, p. 205, 
2022.https://doi.org/10.3390/jimaging8080205 
[6] W. Al-Dhabyani, M. Gomaa, H. Khaled, and A. 
Fahmy, “Deep learning approaches for data 
augmentation and classification of breast masses 
using ultrasound images,” Int. J. Adv. Comput. Sci. 
Appl., vol. 10, no. 5, 2019. 
https://doi.org/10.14569/ijacsa.2019.0100579 
 
[7] Y. Hao, S. Qiao, L. Zhang, T. Xu, and Y. Bai, 
“Breast Cancer Histopathological Image 
Recognition Based on Low Dimensional’,” Three-
Channel Features, vol. 
11.https://doi.org/10.3389/fonc.2021.657560 
[8] Suleiman Ali Alsaif and Adel Hidri. Impact of data 
balancing during training for best predictions, 
Informatica, 45(2): 223–230, 2021. 
https://doi.org/10.31449/inf.v45i2.3479. 
 
[9] A. Saber, M. Sakr, O. M. Abo-Seida, A. Keshk, and 
H. Chen, “A novel deep-learning model for 
automatic detection and classification of breast 
cancer using the transfer-learning technique,” IEEE 
Access, vol. 9, pp. 71194–71209, 
2021.https://doi.org/10.1109/access.2021.3079204. 
[10] V. Azevedo, C. Silva, and I. Dutra, “Quantum 
transfer learning for breast cancer detection,” 
Quantum Mach. Intell., vol. 4, no. 1, p. 5, 
2022.https://doi.org/10.1007/s42484-022-00062-4 
[11] K. K. Dewangan, D. K. Dewangan, S. P. Sahu, and 
R. Janghel, “Breast cancer diagnosis in an early 
stage using novel deep learning with hybrid 
optimization technique,” Multimed. Tools Appl., 
vol. 81, no. 10, pp. 13935–13960, 
2022.https://doi.org/10.1007/s11042-022-12385-2. 
[12] A. Rasool, C. Bunterngchit, L. Tiejian, M. R. Islam, 
Q. Qu, and Q. Jiang, “Improved machine learning-
based predictive models for breast cancer 
diagnosis,” Int. J. Environ. Res. Public Health, vol. 
19, no. 6, p. 3211, 2022. 
https://doi.org/10.3390/ijerph19063211 
 
[13] L. Rh, B. Kujabi, C. Chuang, C. Lin, and C. Chiu, 
“Application of deep learning to construct breast 
cancer diagnosis model,” Appl Sci, vol. 12, no. 4, 
2022.https://doi.org/10.3390/app12041957 
[14] M. M. Alshammari, A. Almuhanna, and J. Alhiyafi, 
“Mammography image-based diagnosis of breast 
cancer using machine learning: A pilot study,” 
Sensors (Basel), vol. 22, no. 1, p. 203, 
2021.https://doi.org/10.3390/s22010203 
[15] A. Ghasemzadeh, S. Sarbazi Azad, and E. Esmaeili, 
“Breast cancer detection based on Gabor-wavelet 
transform and machine learning methods,” Int. J. 
Mach. Learn. Cybern., vol. 10, no. 7, pp. 1603–
1612, 2019. 
https://doi.org/10.1007/s13042-018-0837-2 
 
10   Informatica 47 (2023) 1–10                                                                                                                                             R.T. Gdeeb 
[16] M. Alruwaili and W. Gouda, “Automated breast 
cancer detection models based on transfer learning,” 
Sensors (Basel), vol. 22, no. 3, p. 876, 
2022.https://doi.org/10.3390/s22030876 
[17] L. Tsochatzidis and L. Costaridou, “Pratikakis Deep 
learning for breast cancer diagnosis from 
mammograms - a comparative study J,” J. Imaging, 
vol. 5, no. 37, pp. 1–11, 
2019.https://doi.org/10.3390/jimaging5030037 
 
[18] D. Abdelhafiz, C. Yang, and R. Ammar, “Nabavi 
Deep convolutional neural networks for 
mammography: Advances, challenges and 
applications BMC,” Bioinf, vol. 20, pp. 1–20, 
2019.https://doi.org/10.1186/s12859-019-2823-4 
[19] Y. Chen, T. Zheming, Z. Yang, and S. Holly, 
“Norford Transfer learning with deep neural 
networks for model predictive control of HVAC and 
natural ventilation in smart buildings J,” J. Cleaner 
Prod, vol. 254, pp. 1–10, 
2020.https://doi.org/10.1016/j.jclepro.2019.119866 
[20] A. Amyar, R. Modzelewski, and H. Li, “Ruan 
Multi-task deep learning based CT imaging analysis 
for COVID-19 pneumonia: Classification and 
segmentation Comput,” Biol. Med, vol. 126, 
2020.https://doi.org/10.1101/2020.04.16.20064709 
[21] K. Wei, B. Wang, and J. Saniie, “Faster region 
convolutional neural networks applied to ultrasonic 
images for breast lesion detection and 
classification,” in 2020 IEEE International 
Conference on Electro Information Technology 
(EIT), 
2020.https://doi.org/10.1109/eit48999.2020.9208264 
[22] M. Yusoff, T. Haryanto, H. Suhartanto, W. A. 
Mustafa, J. M. Zain, and K. Kusmardi, “Accuracy 
analysis of deep learning methods in breast cancer 
classification: A structured review,” Diagnostics 
(Basel), vol. 13, no. 4, 
2023.https://doi.org/10.3390/diagnostics13040683 
[23] M. Gour, S. Jain, and T. Sunil Kumar, “Residual 
learning based CNN for breast cancer 
histopathological image classification,” Int. J. 
Imaging Syst. Technol., vol. 30, no. 3, pp. 621–635, 
2020.https://doi.org/10.1002/ima.22403