https://doi.org/10.31449/inf.v45i4.3570 Informatica 45 (2021) 571–581 571 
 
Reduced Number of Parameters for Predicting Post-Stroke Activities 
of Daily Living Using Machine Learning Algorithms on Initiating 
Rehabilitation 
Ali Mohammad Alqudah and Munder Al-Hashem 
Department of Biomedical Systems and Informatics Engineering, Yarmouk University, Irbid, Jordan 
E-mail: ali_qudah@hotmail.com, munderalhashem@gmail.com 
 
Amin Alqudah 
Department of Computer Engineering, Yarmouk University, Irbid, Jordan 
E-mail: amin.alqudah@yu.edu.jo 
Keywords: Barthel Index scale (BI), Activities of Daily Living (ADL), stroke, features selection, Machine Learning 
(ML) 
Received: June 01, 2021 
The Barthel Index (BI) scale is a widely used measure of performance in
Activities of Daily Living (ADL), and predicting ADL is crucial for managing the rehabilitation care
and recovery of patients after stroke. In this paper, nine Machine
Learning (ML) algorithms were applied to a medical dataset containing 776 records from 313 patients
(208 men and 150 women), with multiple features collected from each patient, to predict
and classify the BI status as clinical decision support for determining the ADL of post-stroke patients.
We also applied feature selection using the chi-squared test to reduce the number of features
in the dataset. The results showed that the Decision Tree (DT), XGBoost (XGB), and AdaBoost (ADB)
classifiers achieved the highest performance, with 100% accuracy, sensitivity, specificity, and
Area Under the Curve (AUC), and 0% error, on both the full and reduced feature datasets.
Summary (Povzetek): The paper proposes a methodology for reducing the number of parameters for
predicting suitable activities after a stroke.
 
1 Introduction
A stroke is a medical condition that occurs when the blood
flow to part of the brain is reduced or interrupted,
preventing the brain tissue from receiving nutrients and
oxygen, which causes brain cells to die within a few minutes
[1]. Stroke is an emergency that needs urgent care and
attention. Many people around the world
suffer from stroke, and almost two-thirds of those
individuals survive but need rehabilitation and recovery.
Patients with long-standing disabilities after stroke can
face many difficulties in daily life, whether
physical, social, familial, or mental. Therefore, the main
objective of rehabilitation and recovery is to observe the
patient's functions after a stroke and to monitor the level
of independence in order to achieve the greatest potential in
Activities of Daily Living (ADL) [2].
Predicting the ADL is crucial for the effective and careful
management of patients after stroke, especially in
the first months. Furthermore, ADL provides an overview
of the person's independence in daily life and of the
support and health care that must be provided by both the
government and the patient's family [2]. For example, in 2016
a new stroke event occurred every 40 seconds in the USA
[3], while in 2012 in
Taiwan around 230 patients were admitted to
hospital every day due to acute stroke
[3]. If the disabilities and impairments
persist for a long time in patients affected by a
stroke, this may lead to a massive mental, physical, and
financial burden for the patients and their
families, which can be overcome by
rehabilitating stroke patients as early as possible [3]. The
main target of rehabilitating stroke patients is to ensure
that they return to their normal lives after training
that enables them to perform activities of daily
living (ADL) independently. Moreover, the first stage of
rehabilitation is to accurately predict ADL, because
accurate prediction of ADL is vital for optimizing
stroke management in the first two to three months
following a stroke [3].
During the rehabilitation process of patients with
stroke, different assessments and parameters should be
checked for patients of the Post-acute Care-Cerebrovascular
Diseases (PAC-CVD) program to provide important and
detailed information about the patient's health from
various aspects [3]. The collected parameters include the
Barthel index (BI), modified Rankin scale (MRS),
functional oral intake scale (FOIS), mini nutrition
assessment (MNA), instrumental activities of daily living
scale (IADL), Berg balance test (BBT), gait speed, 6-min
walk test (6MWT), Fugl–Meyer upper extremity
assessment (FuglUE), modified Fugl–Meyer sensory
assessment (FuglSEN), mini-mental state examination
(MMSE), motor activity log (MAL), and concise Chinese
aphasia test (CCAT). Using these multiple assessment
parameters (features), we can obtain more
comprehensive data and conclusions about the current
status of patients than from a single assessment feature.
However, an accurate tool for the prediction of
ADL is not available, because many other
features contribute to the prognosis of ADL. To solve this problem,
machine learning (ML) systems can be used [1].
ML is an application of Artificial
Intelligence (AI) that provides the ability to automatically
learn and improve from experience. ML
focuses on building a mathematical model for
prediction based on patterns extracted
from the training data, whether the dataset is large or
complex. Recently, ML algorithms have been used extensively
in real-life applications and in the healthcare field. The
healthcare field is directly related to human life, which is
why using ML to improve medical diagnosis
and reduce its error rate is valuable
[1, 2, 3]. This research presents several ML algorithms
for predicting the ADL of patients after stroke, giving
information about the person's independence based on
multiple features by predicting the BI score. The BI
score is a widely used and well-known
tool for assessing independence in ADL functions,
especially in post-stroke patients. To apply
machine learning, we use a dataset of 18 parameters, 17 of
them as features and the last one representing the BI score;
the dataset contains 776 records from 313 patients
(208 men and 150 women). We used
nine different ML algorithms:
Logistic Regression (LR), K Nearest Neighbor (KNN),
Support Vector Machine (SVM), Naive Bayes (NB),
Decision Tree (DT), Random Forest (RF), XGBoost
(XGB), AdaBoost (ADB), and Deep Neural Network
(DNN). The algorithms are evaluated using a training
and testing methodology, and performance
is measured using the accuracy, sensitivity, specificity,
error, Area Under the Receiver Operating Characteristic Curve (AUC), and
confusion matrix.
This paper is organized as follows: Section 2
describes the related work; Section 3 explains the
materials and methods, including the post-stroke dataset used in our
study and the ML algorithms applied to the multiple
features. Section 4 presents the results of the
suggested methodology; Section 5 discusses
the results of the proposed research. Finally, we
conclude the paper in Section 6.
2 Literature review 
In this section, the most recent works related to
stroke rehabilitation prediction using machine learning
are discussed. Janne et al. [2] studied the
early post-stroke stage to identify the
variables that predict the final outcome of Activities of Daily Living
(ADLs) after stroke. Their method distinguished
high- and low-quality studies according to risk of bias and used qualitative
synthesis. The results showed that the median risk of bias score
was 17 out of 27.
Yin Lin et al. [3] proposed a method for predicting the
activities of daily living (ADL) of post-stroke patients.
Several Machine Learning (ML) algorithms, namely
Logistic Regression (LR), Support Vector Machine
(SVM), Linear Regression, and Random Forest (RF), were used to
estimate the Barthel index (BI). The results showed that RF
and LR achieved the higher Area Under the Curve (AUC)
of 0.79, compared with an AUC of
0.77 for the SVM algorithm. Moreover, the Mean Absolute Error (MAE) of
linear regression and SVM was 9.95 and 9.86,
respectively.
Chu Pai et al. [4] studied and
predicted the change over time in Activities of Daily Living
(ADLs) of post-stroke patients within 24 hours and the
whole months of the year. The ADLs were measured by
the Barthel Index (BI). A latent growth curve model was
used to analyze the dynamic changes in ADLs. The
results showed that as the time following a stroke increases,
survivors tend to improve continuously with respect to
ADLs. Moreover, lower initial levels of ADLs
were associated with larger growth of ADLs over time.
Douiri et al. [5] developed a prediction
model for monitoring post-stroke patients' recovery and evaluated
its clinical use. Activities of Daily Living (ADLs) were
evaluated using the Barthel Index (BI) for many weeks
after the stroke. Penalized linear blended models were
built to predict the functional recovery of patients.
The predicted recovery curves showed good
accuracy, with a Root Mean Squared Deviation (RMSD) of 3
BI points up to one year. Predictive values for the risk of
poor recovery (BI less than 8) were also evaluated at three and 12 months.
Bertolin et al. [6] presented a prediction study of patients
after mild to moderate acute stroke. Many predictor
variables were measured for predicting the ADLs. The
results showed that physical variables explained more variance
in ADLs than in the communication, memory, and thinking
results. The Short-Blessed Test (SBT) was the only
significant independent predictor of communication,
memory, and thinking, while the National Institutes of
Health Stroke Scale (NIHSS) was the only
measure that reliably predicted ADLs.
Gialanella et al. [7] investigated
the Functional Independence Measure (FIM) items of ADLs as
potential predictors of outcome after stroke following
inpatient rehabilitation. Two backward stepwise regression
analyses were used to identify the most important
independent variables. The results showed that social
interaction, grooming, upper body dressing, and bowel
control are the most important independent variables
for evaluating the FIM of ADLs.
Lee et al. [8] developed a
computational method to examine the potential of
predicting the Quality of Life (QOL) after post-stroke
rehabilitation. Five classifiers were applied to personal
factors and nine functional factors to estimate the QOL.
The results showed that the Particle Swarm-Optimized
Support Vector Machine (PSO-SVM) gave the highest
accuracy (58.33%) and the highest cross-validated
accuracy (74.29%), making it the best
classifier for predicting the QOL of stroke patients.
3 Materials and methods 
In this section, we explain the dataset used and our
proposed methodology, and then compare the
performance of the various classifiers.
3.1 Materials 
In this paper, the dataset introduced in [3] has been used.
The dataset contains 776 patient records and 15 features:
gender, age, acute ward
stay, LOS, BI admission, MRS, FOIS, MNA, IADL, BBT,
6MWT, FuglUE, MMSE, MAL Quality, and BI discharge.
These values are collected from patients, and the output
(Class) is determined based on them. Table 1 shows the
features and class values for the used dataset.
The dataset authors mentioned that the features are collected
from patients who meet the following criteria:
1. Stroke onset time must be within the last month.
2. Hemodynamic parameters should be stable
within the last 72 hours.
3. No neurological deterioration within the last 72
hours.
4. Sufficient cognitive function and ability to learn
rehabilitation exercises (especially MRS between 2 and 4).
They also mentioned that the exclusion criteria for
patients include:
1. Stroke onset time > 1 month.
2. Patients with end-stage renal disease or those
receiving dialysis therapy.
After applying the above criteria, all qualifying patients receive a
detailed post-stroke status assessment at admission and every three
weeks before discharge. The assessments
cover the following measures (MRS, BI, FOIS,
MNA, QoL, IADL, BBT, gait speed, 6MWT, FuglUE,
FuglSEN, MMSE, MAL, and CCAT). During the stay in the
rehabilitation ward, these assessments were used to predict
the BI status or the BI score at discharge from the
rehabilitation unit. In addition, the patient's age and
length of stay in the
acute stroke ward before referral to PAC-CVD are included as features
[3]. The summary of the dataset is shown in Table 2.
3.2 Methodology 
As shown in Figure 1, the proposed study
first handles the multiclass nature of the BI score, then
applies nine classification algorithms and compares
them to predict the BI score, which
represents the ADL of stroke patients.
3.3 Data preparation 
The BI at discharge from the PAC-CVD ward
is categorized into different value ranges.
A BI score of 20 represents
independent ADL, a BI score in the range 15-19
represents mild dependent ADL, and a BI score in
the range 10-14 represents moderate dependent
ADL. Therefore, the multiclass classification problem of
the BI score can be handled by discretizing the BI into 3 ordinal
classes.
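
The discretization described above can be sketched as a small mapping function. This is a minimal illustration, not the authors' code: the function name and the class label strings are assumptions, and the moderate range is taken as 10-14, following the BI Discharge row of Table 1.

```python
def discretize_bi(bi_score: int) -> str:
    """Map a BI discharge score to one of three ordinal classes.

    Label strings are illustrative assumptions; the ranges follow the text
    (20, 15-19) and Table 1 (10-14).
    """
    if bi_score == 20:
        return "independent"
    if 15 <= bi_score <= 19:
        return "mild dependent"
    if 10 <= bi_score <= 14:
        return "moderate dependent"
    # Scores outside the three ranges are not covered by this scheme.
    raise ValueError(f"BI score {bi_score} is outside the three classes")


# Example: a few scores within the range reported for BI Discharge.
labels = [discretize_bi(s) for s in (11, 14, 15, 19, 20)]
```

Raising an error for out-of-range scores is a design choice of this sketch; it makes unexpected values visible instead of silently assigning them to a class.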
3.3.1 Features selections 
In this paper, to enhance the results and reduce the time
needed for future patient BI score prediction, we perform a
Chi-squared test for feature selection. The Chi-squared
test is a well-known statistical test of independence that is
usually used to determine whether two categorical
variables are independent. Feature selection
using this test is simple: we test whether each
feature is independent of the target. If the feature is
independent of the target, it is unlikely to be useful for
predicting the target and is not selected; otherwise, if the
feature and the target are not independent, the feature most likely has
predictive power for the target and is selected [9].
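
To make the selection rule concrete, the following sketch computes the Pearson chi-squared statistic from a contingency table and keeps the highest-scoring features. It is an illustration under stated assumptions: the function names and the toy data are hypothetical, the features are treated as categorical, and the paper does not state which implementation was actually used.

```python
from collections import Counter


def chi2_statistic(feature, target):
    """Pearson chi-squared statistic for two categorical sequences."""
    n = len(feature)
    observed = Counter(zip(feature, target))  # contingency table counts
    feature_totals = Counter(feature)
    target_totals = Counter(target)
    stat = 0.0
    for f, n_f in feature_totals.items():
        for t, n_t in target_totals.items():
            expected = n_f * n_t / n  # count expected under independence
            stat += (observed[(f, t)] - expected) ** 2 / expected
    return stat


def top_k_features(columns, target, k):
    """Return the names of the k features with the largest statistic."""
    scores = {name: chi2_statistic(values, target) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy data (hypothetical): one feature tracks the class, the other does not.
target = ["A", "A", "B", "B", "C", "C"]
columns = {
    "informative": [0, 0, 1, 1, 2, 2],
    "noise": [0, 1, 0, 1, 0, 1],
}
selected = top_k_features(columns, target, k=1)
```

A larger statistic means stronger evidence against independence, so ranking by the statistic and keeping the top k mirrors the "select dependent features" rule in the text.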
3.3.2 Classification algorithms 
Classification in general is the systematic
grouping of items according to some features or
criteria. It is a type of supervised machine learning in
which an algorithm "learns" to classify new observations
from examples of labeled data, and it is used to develop
models that predict which group or class something
belongs to [10-13]. Classification is a two-step process.
In the first step, the model is created by applying a
classification algorithm to a training dataset; in the
second step, the resulting model is tested against a
predefined test dataset to measure the trained model's
accuracy. Classification is thus the process of assigning a class
label to data whose class label is unknown [13-16].
The following subsections discuss the classifiers used
in our paper.
Support Vector Machine (SVM) 
Support vector machine (SVM) is a supervised machine learning
algorithm that is used for both
classification and regression tasks, although it is most often
used for classification.
Given a set of
training examples, each marked as
belonging to one of two classes, the SVM
training algorithm builds a model that assigns new examples
to one class or the other, making it a non-probabilistic binary
linear classifier [10].
 
Figure 1: The block diagram of the proposed study: Data Preparation → Features Selection → Classification Algorithms → Performance Evaluation → Best Model Selection.
 
Naïve Bayes (NB) 
Naive Bayes classifiers are based on Bayesian
classification techniques, which rely on Bayes'
theorem, an equation defining the relationship
between the conditional probabilities of statistical
quantities. In Bayesian classification, we
are interested in finding the probability of a class
given some features that have been
observed. In general, Naive Bayes is a simple but
powerful and widely used machine learning classifier. It is
a probabilistic classifier that makes classifications in a
Bayesian setting using the Maximum A Posteriori (MAP)
decision rule. It can also be represented by a very simple
Bayesian network. Naive Bayes classifiers are especially
popular for text classification and are a standard
approach to problems such as spam detection. The naive
Bayes classifier does not need any parameter setup [11].
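
For reference, the MAP decision rule mentioned above can be written out in standard textbook notation. The symbols are introduced here for illustration and do not appear in the original text: $c$ is a candidate class, $x_1, \dots, x_n$ are the observed feature values, and the last step uses the "naive" assumption that features are conditionally independent given the class.

```latex
\hat{c}
  = \arg\max_{c} P(c \mid x_1, \dots, x_n)
  = \arg\max_{c} \frac{P(c)\, P(x_1, \dots, x_n \mid c)}{P(x_1, \dots, x_n)}
  = \arg\max_{c} P(c) \prod_{i=1}^{n} P(x_i \mid c)
```

The denominator is dropped in the last step because it does not depend on $c$ and therefore does not change which class attains the maximum.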
Table 1: Dataset Features, Range, and Description.
Feature | Range | Description
--- | --- | ---
Gender | Male: 1, Female: 2 | Gender
Age | 35-85 years | Age
Acute Ward Stay | 7-20 days | Length of stay in acute stroke ward
LOS | 8-25 days | Length of stay in rehabilitation ward
BI Admission | 20: independent ADL; 15-19: mild dependent; 10-14: moderate dependent; 5-9: severe dependent | Barthel index on admission to PAC-CVD ward
MRS | 0-2: mild disability; 3-5: moderate to severe disability | Modified Rankin Scale
FOIS | 1 to 7, where Level 1 means nothing by mouth and Level 7 means total oral diet with no restrictions | Functional oral intake scale
MNA | <8: malnutrition; 8-11: risk of malnutrition; >11: no malnutrition | Mini nutrition assessment
IADL | 0: low function (dependent); 8: high function (independent) | Instrumental activities of daily living scale
BBT | 0 to 4 (determining sitting and standing balance) | Berg balance test
Gait | 0.3 m/s to 1 m/s | Gait speed
6MWT | 210 m to 300 m | 6-min walk test
FuglUE | 0: cannot perform; 1: partially; 2: fully | Fugl–Meyer upper extremity assessment (motor recovery)
FuglSEN | 0: cannot perform; 1: partially; 2: fully | Assesses the sensorimotor impairment in individuals who have had stroke
MMSE | >=24: normal; 19-23: mild; 10-18: moderate; <=9: severe | Mini-mental state examination
MAL Quality | 0: never; 1: poor; 2: fair; 3: normal | Motor activity log, quality of use of arm function
BI Discharge | 20: independent ADL; 15-19: mild dependent; 10-14: moderate dependent; 5-9: severe dependent | Barthel index at discharge from PAC-CVD ward
Class | | Patient classification value

Decision Tree (DT)
The Decision Tree algorithm belongs to the family of
supervised learning algorithms. Unlike some other supervised
learning algorithms, the decision tree algorithm can
be used to solve both regression and classification problems
[12]. The purpose of a Decision Tree is to create a
model that can be used to predict the class or value
of the target variable by learning simple decision rules
derived from prior data (training data). In Decision
Trees, to predict the class label for a record, we start from
the root of the tree. We compare the value of the root
attribute with the record's attribute. Based on the
comparison, we follow the branch corresponding to that
value and move to the next node. Decision trees classify
examples by sorting them down the tree from the root to some
leaf/terminal node, with the leaf/terminal node providing the
classification of the example. Each node in the tree acts as a
test case for some attribute, and each edge descending
from the node corresponds to one of the possible answers to the
test case. This process is recursive and is repeated for
every subtree rooted at a new node [12].
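
The root-to-leaf traversal described above can be sketched as follows. This is a minimal illustration with a hand-built tree: the class names `Node` and `Leaf`, the thresholds, and the leaf labels are all hypothetical and are not taken from the trained model in the paper.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    label: str  # class returned when a record reaches this leaf


@dataclass
class Node:
    feature: str                 # attribute tested at this node
    threshold: float             # split value for the test
    left: Union["Node", Leaf]    # branch followed when value <= threshold
    right: Union["Node", Leaf]   # branch followed when value > threshold


def predict(tree, record):
    """Walk from the root to a leaf, following one branch per test."""
    node = tree
    while isinstance(node, Node):
        node = node.left if record[node.feature] <= node.threshold else node.right
    return node.label


# Hypothetical tree; the thresholds are made up for illustration only.
toy_tree = Node(
    "BI Admission", 14,
    Leaf("moderate dependent"),
    Node("6MWT", 250, Leaf("mild dependent"), Leaf("independent")),
)
prediction = predict(toy_tree, {"BI Admission": 18, "6MWT": 280})
```

Learning the tree, that is, choosing which attribute and threshold to test at each node, is a separate step that this sketch does not cover.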
Random Forests (RF) 
The Random Forest (RF) classifier, first proposed by
Breiman [13], is one of the most common classification
tools and an effective ensemble machine learning technique.
The key principle of the RF classifier is to construct each
classification tree from a few randomly selected
features of randomly selected samples using a bagging
technique. The constructed trees then vote on an input
vector to produce a class label. RF classifiers are built from
several simple learners, where each base learner is an
individual binary tree built by recursive partitioning. RF
has many advantages: it often has better accuracy than other
classifiers, handles large-scale data efficiently, is
resistant to overfitting, and can be conveniently extended to multi-class
inputs. The RF classifier has demonstrated superior
classification performance over other suggested methods
since it was proposed [10, 11].
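
The two ingredients named above, bagging and voting, can be sketched in a few lines. This is an illustration only: the function names are hypothetical, tree training is omitted, and the five votes shown are made-up values standing in for the outputs of five trees.

```python
import random
from collections import Counter


def bootstrap_sample(records, rng):
    """Bagging step: draw len(records) records with replacement."""
    return rng.choices(records, k=len(records))


def random_feature_subset(feature_names, n_features, rng):
    """Pick the few randomly selected features that one tree may use."""
    return rng.sample(feature_names, n_features)


def majority_vote(tree_predictions):
    """Combine the class labels voted by the individual trees."""
    return Counter(tree_predictions).most_common(1)[0][0]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
sample = bootstrap_sample(list(range(10)), rng)
subset = random_feature_subset(["Age", "FOIS", "6MWT", "MRS"], 2, rng)

# Hypothetical votes from a forest of five trees for one patient record.
forest_label = majority_vote(["A", "B", "A", "C", "A"])
```

Each tree would be trained on its own bootstrap sample and feature subset, which is what makes the individual trees differ and the vote informative.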
eXtreme Gradient Boosting (XGBoost) 
XGBoost is a decision-tree-based Machine Learning algorithm
that uses a gradient boosting framework.
Artificial neural networks tend to outperform all other
algorithms or frameworks in prediction problems involving
unstructured data (images, text, etc.). For small to
medium structured/tabular data, however, decision-tree-based
algorithms are currently regarded as best-in-class.
Since its introduction, this algorithm has been
credited as the driving force under the hood of
many cutting-edge industry applications [14].
Table 2: Dataset Features Statistical Description.
Feature | Standard Deviation | Mean | Minimum | Maximum | 25% Percentile | 50% Percentile | 75% Percentile
--- | --- | --- | --- | --- | --- | --- | ---
Age | 14.79479 | 60.33376 | 35 | 85 | 48 | 61 | 73
Acute Ward Stay | 4.104472 | 13.84021 | 7 | 20 | 10 | 14 | 18
LOS | 5.297994 | 16.55928 | 8 | 25 | 12 | 17 | 21
BI Admission | 3.502869 | 14.59278 | 9 | 20 | 11 | 15 | 18
MRS | 1.703339 | 2.597938 | 0 | 5 | 1 | 3 | 4
FOIS | 2.016674 | 4.063144 | 1 | 7 | 2 | 4 | 6
MNA | 4.036867 | 8.271907 | 2 | 15 | 5 | 8 | 12
IADL | 2.616618 | 3.840206 | 0 | 8 | 1 | 4 | 6
BBT | 1.409733 | 1.967784 | 0 | 4 | 1 | 2 | 3
6MWT | 26.66156 | 254.357 | 210 | 300 | 231 | 253 | 277
FuglUE | 0.809418 | 1.018041 | 0 | 2 | 0 | 1 | 2
MMSE | 7.027365 | 18.75258 | 7 | 30 | 13 | 19 | 25
MAL Quality | 1.104851 | 1.449742 | 0 | 3 | 0 | 1 | 2
BI Discharge | 3.092184 | 16.43041 | 11 | 20 | 14 | 17 | 20

Adaptive Boosting (AdaBoost)
AdaBoost is an ensemble learning method (also known as
"meta-learning") that was
originally designed to improve the performance of binary
classifiers. AdaBoost
uses an iterative approach to learn from the
errors of weak classifiers and improve on them. AdaBoost is a common
boosting strategy that attempts to combine several weak
classifiers to create a strong classifier. A single weak
classifier cannot reliably predict an object's class, but we
can build a strong model by combining several
weak classifiers, each of them progressively learning
from the objects misclassified by the others. The
weak classifier may be any basic classifier, from
Decision Trees (often the default) to Logistic Regression,
etc. [15].
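
The idea of "learning from the incorrectly categorized objects" can be illustrated with one round of the classical binary AdaBoost weight update. This is a sketch under assumptions: labels are encoded as +1 and -1, the function name is hypothetical, and the three-class BI problem in this paper would need a multi-class variant that is not shown here.

```python
import math


def adaboost_round(weights, y_true, y_pred):
    """One round of binary AdaBoost with labels encoded as +1 / -1.

    Returns the weak learner's vote weight (alpha) and the re-normalized
    sample weights. Assumes the weighted error is strictly between 0 and 1.
    """
    # Weighted error: total weight of the misclassified samples.
    err = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p)
    alpha = 0.5 * math.log((1 - err) / err)
    # Misclassified samples (t * p == -1) are up-weighted, the rest down-weighted.
    new_w = [w * math.exp(-alpha * t * p) for w, t, p in zip(weights, y_true, y_pred)]
    total = sum(new_w)
    return alpha, [w / total for w in new_w]


# Five samples with uniform weights; the weak learner gets the last one wrong.
y_true = [1, 1, -1, -1, 1]
y_pred = [1, 1, -1, -1, -1]
alpha, new_weights = adaboost_round([0.2] * 5, y_true, y_pred)
```

After the update, the single misclassified sample carries half of the total weight, so the next weak classifier is pushed to get it right.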
Deep Neural Network (DNN) 
A deep neural network (DNN) is an artificial neural
network (ANN) with several layers between the input and
output layers. There are various types of neural networks,
but they are all made up of the same components: neurons,
synapses, weights, biases, and functions. These components
function similarly to those of the human brain and can be trained like
any other ML algorithm [16, 17]. DNNs can model
complex non-linear relationships. DNN architectures
create compositional models in which the object is
expressed as a layered composition of primitives. Extra
layers permit the composition of features from lower
layers, potentially modeling complex data with fewer
units than a similarly performing shallow network. Deep
architectures include many variants of a few basic
approaches, and each architecture has achieved success
in particular fields. It is not always possible to compare the
performance of different architectures unless they are
evaluated on the same datasets [18]. DNNs are usually feed-forward
networks in which data flows from the input layer
to the output layer without looping back. First, the DNN
creates a map of virtual neurons and assigns random numerical
values, or "weights," to the connections between them. The
weights and inputs are multiplied, and the result is
returned as an output between 0 and 1 [19]. If the network does not
accurately recognize a particular pattern, an algorithm
adjusts the weights. In this way, the algorithm can make
certain parameters more influential until it determines the
correct mathematical manipulation to
fully process the data [20].
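
The feed-forward step described above, multiplying weights and inputs and returning a value between 0 and 1, can be sketched with a sigmoid activation. This is an assumption-laden illustration: the text does not name the activation function, the layer sizes and weight values below are arbitrary, and training (the weight adjustment) is not shown.

```python
import math


def sigmoid(z):
    """Squash a real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases):
    """One fully connected layer: weighted sum plus bias, then sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(x, layers):
    """Feed-forward pass: data flows through the layers without looping back."""
    for weights, biases in layers:
        x = dense(x, weights, biases)
    return x


# Tiny hypothetical network: 3 inputs -> 2 hidden units -> 1 output.
layers = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1]),
    ([[1.0, -1.0]], [0.0]),
]
output = forward([1.0, 0.5, -1.5], layers)
```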
Performance evaluation 
Evaluating the performance of machine learning
algorithms is an essential part of any project. A model
is considered satisfactory when it scores well on
metrics such as accuracy, specificity, sensitivity,
error, and AUC [10-20]. These metrics are calculated
from the True Positive (TP), False Positive (FP), False
Negative (FN), and True Negative (TN) counts obtained by comparing the outputs of
the proposed system with the reference data.
Accordingly, the accuracy, sensitivity,
specificity, and error were evaluated as follows:
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN}$$

$$\mathrm{Sensitivity} = \frac{TP}{TP + FN}$$

$$\mathrm{Specificity} = \frac{TN}{TN + FP}$$

$$\mathrm{Error} = 100 - \mathrm{Accuracy}$$
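
For a three-class problem, these formulas are typically applied per class in a one-vs-rest fashion. The sketch below assumes that convention and reports all values as percentages; the function name and the confusion matrix are hypothetical and are not data from the paper.

```python
def evaluate(confusion):
    """Compute accuracy, error, and per-class sensitivity and specificity.

    `confusion[i][j]` counts records of true class i predicted as class j.
    All values are returned as percentages.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    accuracy = 100.0 * sum(confusion[k][k] for k in range(n_classes)) / total
    report = {"accuracy": accuracy, "error": 100.0 - accuracy}
    for k in range(n_classes):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                              # class k, predicted otherwise
        fp = sum(confusion[r][k] for r in range(n_classes)) - tp  # other classes predicted as k
        tn = total - tp - fn - fp
        report[f"sensitivity_{k}"] = 100.0 * tp / (tp + fn)
        report[f"specificity_{k}"] = 100.0 * tn / (tn + fp)
    return report


# Hypothetical 3-class confusion matrix (rows: true class, columns: predicted).
toy_confusion = [
    [46, 2, 0],
    [0, 65, 0],
    [0, 0, 43],
]
report = evaluate(toy_confusion)
```

The sketch assumes every class has at least one positive and one negative record, so none of the denominators is zero.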
Best model selection 
Each ML algorithm is fitted to the training dataset.
The learning algorithm discovers
patterns in the training data that
map the input attributes to the target,
which is the answer to be predicted, and produces an
ML model that captures these patterns.
The testing dataset is then applied to the
ML models to test their performance. After
comparing the results of the nine algorithms
used, the model with the
best accuracy, sensitivity, specificity, error, and AUC
is chosen to predict the BI, which was
discretized into 3 ordinal classes [10].
Table 3: Training and testing dataset division.
Dataset | Training set | Testing set
--- | --- | ---
Post Stroke Dataset | 620 | 156
 
Figure 2: Training and Validation Accuracy. 
 
Figure 3: Training and Validation Loss. 
 
Figure 4: Accuracy performance evaluation (%): SVM 98.71, NB 98.71, DT 100, RF 99.35, XGB 100, ADB 100, DNN 98.71.
 
4 Results 
The proposed methodology was applied with an 80% -
20% training and testing division of the post-stroke
dataset, as shown in Table 3. Nine different ML classifiers
are used for classifying the discretized BI score that
represents the ADL. The multi-class classification process was
successful in predicting the BI score with a high level of
performance.
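
The sizes in Table 3 are consistent with an 80% - 20% division of the 776 records when the test size is rounded up. The sketch below reproduces only that arithmetic with a shuffled index split; the function name and the seed are hypothetical, and the paper does not state how the records were actually shuffled or whether the split was stratified.

```python
import math
import random


def train_test_indices(n_records, test_fraction=0.2, seed=0):
    """Shuffle record indices and split them into training and testing sets."""
    n_test = math.ceil(n_records * test_fraction)  # round the test size up
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    return indices[n_test:], indices[:n_test]


train_idx, test_idx = train_test_indices(776)
print(len(train_idx), len(test_idx))  # 620 156
```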
4.1 Features selection results 
Before training and testing the classifiers
with and without feature selection, we first
applied the feature selection method and identified the
top five and the worst five features. The top five
features selected by the Chi-squared test were,
in order: BI Admission, BI Discharge, 6MWT,
Age, and FOIS. The worst five were, in order:
Acute Ward Stay, FuglUE, MRS, Gender, and
MAL Quality. Only the top five features are used by
the classifiers trained with feature selection.
4.2 Classifiers results without features selection
The nine classifiers were implemented as follows.
The SVM classifier used a linear
kernel function, and the RF classifier
used 5 trees in the forest. The
DNN classifier was trained using the Adaptive
Moment Estimation (Adam) solver with an initial
learning rate of 0.001. Figure 2 shows the training
and validation accuracy, while Figure 3 shows the training
and validation loss.
The hyperparameters of the SVM, RF, and DNN classifiers
were chosen using GridSearchCV,
a library function that loops through
predefined hyperparameter values, fits the model on the
training set for each combination, and finally selects the best parameters
from the listed hyperparameters. The other classifiers
used were NB, DT, XGB, and ADB.
All algorithms achieved very high
performance using all the dataset features. Figure 4
shows the accuracy results, Figure 5 shows the
error, Figure 6 shows
the AUC, and Figure 7 shows the
sensitivity and specificity of the three discretized classes,
all obtained from the nine different ML algorithms.
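
The looping behaviour described for the grid search can be sketched in plain Python. This is not the library function itself: the names `grid_search` and `toy_score`, the parameter names, and the scoring rule are hypothetical, and the cross-validation that a real grid search performs for each combination is replaced by a made-up score.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Try every combination in the grid and keep the best-scoring one."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)  # stands in for fitting and validating a model
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(params):
    """Hypothetical score that peaks at n_trees=5 and max_depth=3."""
    return -abs(params["n_trees"] - 5) - abs(params["max_depth"] - 3)


grid = {"n_trees": [1, 5, 10], "max_depth": [2, 3, 4]}
best_params, best_score = grid_search(grid, toy_score)
```

In practice `score_fn` would train the classifier with the given hyperparameters and return a validation metric such as accuracy.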
4.3 Classifiers results with features selection
The same classifier settings are applied to the features
selected by the Chi-squared test, so only five
features are fed to the classifiers for the following results.
Figure 8 shows the training and validation accuracy, while
Figure 9 shows the training and validation loss.
 
Figure 5: Error performance evaluation (%): SVM 1.28, NB 1.28, DT 0, RF 0.64, XGB 0, ADB 0, DNN 1.28.

Figure 6: AUC performance evaluation: SVM 0.99, NB 0.99, DT 1, RF 1, XGB 1, ADB 1, DNN 0.99.

Figure 7: Sensitivity and specificity performance evaluation of the three classes (%).
Metric | SVM | NB | DT | RF | XGB | ADB | DNN
--- | --- | --- | --- | --- | --- | --- | ---
Sensitivity of Class A | 95.83 | 95.83 | 100 | 97.87 | 100 | 100 | 95.83
Specificity of Class A | 100 | 100 | 100 | 100 | 100 | 100 | 100
Sensitivity of Class B | 100 | 100 | 100 | 100 | 100 | 100 | 100
Specificity of Class B | 97.8 | 97.8 | 100 | 98.88 | 100 | 100 | 97.8
Sensitivity of Class C | 100 | 100 | 100 | 100 | 100 | 100 | 100
Specificity of Class C | 100 | 100 | 100 | 100 | 100 | 100 | 100

The hyperparameters of the SVM, RF, and DNN classifiers 
were again chosen using GridSearchCV, a scikit-learn 
utility that loops through predefined hyperparameter 
values, fits the model on the training set for each 
combination, and selects the best combination from those 
listed. The other classifiers used were NB, DT, XGB, and 
ADB. All of the algorithms employed achieved an extremely 
high level of performance on the features used. Figure 10 
shows the results for the accuracy, Figure 11 shows the 
results for the error, Figure 12 shows the results for 
the AUC, and finally, Figure 13 shows the results for the 
sensitivity and specificity of the three discretized classes, 
all obtained from the nine different ML algorithms. 
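
A hyperparameter search of this kind might look like the sketch below, which uses scikit-learn's `GridSearchCV` with an SVM. Everything specific in it is an assumption made for illustration: the synthetic data from `make_classification` (sized to 776 records, 5 features, and 3 classes only to mirror the setting), the 80/20 split, the 5-fold cross-validation, and the candidate values in `param_grid`. None of these are the study's actual data or settings.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# Synthetic stand-in data (assumption): 776 records, 5 features, 3 classes.
X, y = make_classification(
    n_samples=776,
    n_features=5,
    n_informative=4,
    n_redundant=0,
    n_classes=3,
    n_clusters_per_class=1,
    random_state=0,
)

# Hold out a test set; the 80/20 split is an assumption of this sketch.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Candidate hyperparameter values (illustrative, not the paper's grid).
param_grid = {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}

# Fit one model per combination with 5-fold cross-validation on the training set,
# then refit the best combination on the whole training set.
search = GridSearchCV(SVC(), param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

best_params = search.best_params_
test_accuracy = search.score(X_test, y_test)
print(best_params, round(test_accuracy, 3))
```

Because the search only sees the training set, the held-out test accuracy remains an estimate on unseen records.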
  
 
Figure 8: Training and Validation Accuracy. 

Figure 9: Training and Validation Loss. 

Figure 10: Accuracy performance evaluation. 
Plotted values: SVM 98.71, NB 98.71, DT 100, RF 99.35, XGB 100, ADB 100, DNN 98.71. 

Figure 11: Error performance evaluation. 
Plotted values: SVM 1.29, NB 1.29, DT 0, RF 0.65, XGB 0, ADB 0, DNN 1.29. 

Figure 12: AUC performance evaluation. 
Plotted values: SVM 0.99, NB 0.99, DT 1, RF 1, XGB 1, ADB 1, DNN 0.99. 

Figure 13: Sensitivity and specificity performance 
evaluation of the three classes. 
Plotted values: 
                          SVM     NB      DT      RF      XGB     ADB     DNN
Sensitivity of Class A    95.83   95.83   100     97.87   100     100     95.83
Specificity of Class A    100     100     100     100     100     100     100
Sensitivity of Class B    100     100     100     100     100     100     100
Specificity of Class B    97.8    97.8    100     98.88   100     100     97.8
Sensitivity of Class C    100     100     100     100     100     100     100
Specificity of Class C    100     100     100     100     100     100     100
Table 4: Comparison of the proposed methods with the literature. 
Reference | Number of Patients | Number of Features | Methods | Results (%) 
[2] | 48 of 8425 identified citations were included | | Median Risk Bias | 7 out of 27 (range, 6–22) points 
[3] | 313 individuals | 15 | LR, SVM, RF | The AUC is 0.79 for LR and RF, while SVM is 0.77 
[4] | 1,021 stroke survivors | 13 items | A latent growth curve model | As the time following a stroke increases, survivors tend to continuously improve with respect to ADLs 
[5] | 495, 1049 patients | 11 | Penalized Linear Blended Models | RMSD for 3 BI points 
[6] | 498 | 7 | 6 methods | 
[7] | 241 | 19 | Two backward stepwise regression | 
[8] | 130 | 14 | Back Propagation Artificial Neural Network (BP-ANN) | Accuracy (38.33%) and cross-validated (48.51%) 
 | | | Learning Vector Quantization (LVQ) | Accuracy (50.00%) and cross-validated (58.96%) 
 | | | Self-Organizing Mapping (SOM) | Accuracy (53.33%) and cross-validated (66.57%) 
 | | | Support Vector Machine (SVM) | Accuracy (53.33%) and cross-validated (71.47%) 
 | | | Particle Swarm-Optimized SVM (PSO-SVM) | Accuracy (58.33%) and the highest cross-validated (74.29%) 
Proposed Method, No Features Selection | 313 with 776 records | 15 | SVM | Accuracy 98.71 
 | | | NB | Accuracy 98.71 
 | | | DT | Accuracy 100 
 | | | RF | Accuracy 99.35 
 | | | XGB | Accuracy 100 
 | | | ADB | Accuracy 100 
 | | | DNN | Accuracy 98.71 
Proposed Method with Features Selection | 313 with 776 records | 15 | SVM | Accuracy 98.71 
 | | | NB | Accuracy 98.71 
 | | | DT | Accuracy 100 
 | | | RF | Accuracy 100 
 | | | XGB | Accuracy 100 
 | | | ADB | Accuracy 100 
 
 
5 Discussions 
The present study aimed to investigate the use of several 
ML algorithms for classifying the BI score, which 
represents the daily living status of patients after a 
stroke, based on many features. During the training step, 
all of the classifiers achieved extraordinarily high 
performance, and the testing results show that the 
classifiers still achieve high accuracy. The figures 
presented in the results section show that the DT, XGB, 
and ADB classifiers achieved the highest performance, 
reaching 100% correctness in terms of accuracy, 
sensitivity, specificity, error, and AUC for the multi-class 
classification of the discretized BI score, while the SVM 
and DNN classifiers performed the worst. Moreover, the 
feature selection technique reduces the number of features 
to be collected in any future data collection from 15 to 
only 5, a reduction of about 67%, with the same classifier 
results except for RF, whose accuracy improved from 98.71% 
to 100%. This means that less data needs to be collected 
and each collection round takes less time, which makes 
patients more comfortable during data collection. A 
comparison of the proposed methods with other methods in 
the literature is shown in Table 4. The studies listed in 
Table 4 used different datasets, collected either by the 
authors themselves or by others, and they used different 
numbers of classes, patients, and features. These factors 
can significantly affect the performance of the 
classification methods used. However, most of the methods 
listed from the literature achieved acceptable recognition 
rates, and the proposed methods have high classification 
rates compared to the other methods. The system was tested 
for time consumption on a desktop computer with an Intel 
Core i5-6700 processor (3.4 GHz) and 12 GB of RAM, using 
Python 3.9 in the Spyder IDE. 
6 Conclusion 
This research applied nine different machine learning 
algorithms to the prediction and multi-class 
classification of the Barthel Index, which represents the 
activities of daily living of post-stroke patients in 
clinical practice. The research focused on finding the 
best classifier(s) for assessing the dependency in daily 
living of post-stroke patients. We also provided a feature 
reduction methodology using the Chi-squared test to 
reduce the number of features in the dataset and, 
consequently, the number that must be collected from 
patients during each round. Experimental results show that 
the proposed method achieves very high accuracy when 
classifying the BI score into three classes, on both the 
full and the reduced feature datasets. Therefore, the 
proposed method may be used effectively in hospitals, with 
fewer features collected from patients, to predict the 
daily living status of patients after a stroke. Compared 
with other methods in the literature, the proposed method 
is shown to be more effective and can provide a powerful 
tool for automatic stroke patient evaluation using the 
mentioned features. 
Conflict of interest 
The authors declare that they have no conflict of interest. 
This research did not receive any specific grant from 
funding agencies in the public, commercial, or not-for-
profit sectors. 
References 
[1] B. R. Wittenauer and L. Smith, “Priority Medicines 
for Europe and the World: A Public Health 
Approach to Innovation. Update on 2004 
Background Paper Written by Eduardo Sabaté and 
Sunil Wimalaratna. Background Paper 6.6: Ischaemic 
and Haemorrhagic Stroke,” WHO, December 2012. 
[2] Veerbeek JM, Kwakkel G, van Wegen EE, Ket JC, 
Heymans MW. Early prediction of outcome of 
activities of daily living after stroke: a systematic 
review. Stroke. 42(5):1482-8,2011.  
http://doi.org/10.1161/STROKEAHA.110.604090. 
[3] Lin WY, Chen CH, Tseng YJ, Tsai YT, Chang CY, 
Wang HY, Chen CK. Predicting post-stroke 
activities of daily living through a machine learning-
based approach on initiating rehabilitation. 
International journal of medical informatics. 
111(1):159-64, 2018.  
http://doi.org/10.1016/j.ijmedinf.2018.01.002. 
[4] Pai HC, Lai MY, Chen AC, Lin PS. Change in 
activities of daily living in the year following a 
stroke: a latent growth curve analysis. Nursing 
research. 67(4):286-93, 2018.  
http://doi.org/10.1097/NNR.0000000000000280. 
[5] Douiri A, Grace J, Sarker SJ, Tilling K, McKevitt C, 
Wolfe CD, Rudd AG. Patient-specific prediction of 
functional recovery after stroke. International 
Journal of Stroke. 12(5):539-48, 2017. 
http://doi.org/10.1177/1747493017706241. 
[6] Bertolin M, Van Patten R, Greif T, Fucetola R. 
Predicting cognitive functioning, activities of daily 
living, and participation 6 months after mild to 
moderate stroke. Archives of Clinical 
Neuropsychology. 33(5):562-76, 2018.  
http://doi.org/10.1093/arclin/acx096. 
[7] Gialanella B. Predicting outcome after stroke: the role 
of basic activities of daily living. Eur J Phys Rehabil 
Med. 49:629-37, 2013.  
[8] Lee JD, Chang TC, Yang ST, Huang CH, Hsieh FH, 
Wu CY. Prediction of quality of life after stroke 
rehabilitation. Neuropsychiatry. 6(6):369-75, 2016. 
http://doi.org/10.4172/Neuropsychiatry.1000163. 
[9] Jin X, Xu A, Bie R, Guo P. Machine learning 
techniques and chi-square feature selection for 
cancer classification using SAGE gene expression 
profiles. In International workshop on data mining 
for biomedical applications, 9: 106-115, 2006. 
Springer, Berlin, Heidelberg.  
http://doi.org/10.1007/11691730_11. 
[10] Alqudah AM. Ovarian cancer classification using 
serum proteomic profiling and wavelet features a 
comparison of machine learning and features 
selection algorithms. Journal of Clinical 
Engineering. 44(4):165-73, 2019.  
http://doi.org/10.1097/JCE.0000000000000359. 
[11] Ampomah EK, Nyame G, Qin Z, Addo PC, Gyamfi 
EO, Gyan M. Stock Market Prediction with Gaussian 
Naïve Bayes Machine Learning Algorithm. 
Informatica. 45(2), 2021.  
http://doi.org/10.31449/inf.v45i2.3407. 
[12] Tiwari P, Dao H, Nguyen GN. Performance 
evaluation of lazy, decision tree classifier and 
multilayer perceptron on traffic accident analysis. 
Informatica. 41(1), 2017. 
[13] Alqudah AM. Towards classifying non-segmented 
heart sound records using instantaneous frequency 
based features. Journal of medical engineering & 
technology. 43(7):418-30, 2019. 
http://doi.org/10.1080/03091902.2019.1688408. 
[14] Babajide Mustapha I, Saeed F. Bioactive molecule 
prediction using extreme gradient boosting. 
Molecules. 21(8):983, 2016.  
http://doi.org/10.3390/molecules21080983. 
[15] Sun Y, Liu Z, Todorovic S, Li J. Adaptive boosting 
for SAR automatic target recognition. IEEE 
Transactions on Aerospace and Electronic Systems. 
43(1):112-25, 2007.  
http://doi.org/10.1109/TAES.2007.357120. 
[16] Alqudah AM, Alquran H, Qasmieh IA. 
Classification of heart sound short records using 
bispectrum analysis approach images and deep 
learning. Network Modeling Analysis in Health 
Informatics and Bioinformatics. 9(1):1-6, 2020. 
http://doi.org/10.1007/s13721-020-00272-5. 
[17] Alqudah AM, Alquraan H, Qasmieh IA. Segmented 
and non-segmented skin lesions classification using 
transfer learning and adaptive moment learning rate 
technique using pretrained convolutional neural 
network. In Journal of Biomimetics, Biomaterials 
and Biomedical Engineering, 42:67-78, 2019. Trans 
Tech Publications Ltd.  
http://doi.org/10.4028/www.scientific.net/JBBBE.42.67. 
[18] Malkawi A, Al-Assi R, Salameh T, Alquran H, 
Alqudah AM. White blood cells classification using 
convolutional neural network hybrid system. In 2020 
IEEE 5th Middle East and Africa Conference on 
Biomedical Engineering (MECBME), 27: 1-5, 2020. 
IEEE. 
http://doi.org/10.1109/MECBME47393.2020.9265154. 
[19] Alqudah A, Alqudah AM, Qazan S. Lightweight 
Deep Learning for Malaria Parasite Detection Using 
Cell-Image of Blood Smear Images. Rev. 
d'Intelligence Artif.. 34(5):571-6, 2020.  
http://doi.org/10.18280/ria.340506 
[20] Alquran H, Alqudah AM, Abu-Qasmieh I, Al-
Badarneh A, Almashaqbeh S. ECG classification 
using higher order spectral estimation and deep 
learning techniques. Neural Network 
World. 29(4):207-19, 2019.  
http://doi.org/10.14311/NNW.2019.29.014. 
  