https://doi.org/10.31449/inf.v47i7.4583 Informatica 47 (2023) 81–90

A New Multimedia Web-Data Mining Approach based on Equivalence Class Evaluation Pipelined to Feature Maps onto Planar Projection

M. Ravi 1,3, M. Ekambaram Naidu 2 and G. Narsimha 3
1 CMR Institute of Technology, Hyderabad-501401, Telangana, India
2 SRK Institute of Technology, Vijayawada-521108, India
3 Jawaharlal Nehru Tech University, Hyderabad-500085, India
E-mail: ravimogili@gmail.com, menaidu2005@yahoo.co.in, narsimha06@gmail.com

Keywords: supervised learning, information retrieval, multimedia web-databases, statistical metrics, feature extraction

Received: Dec 25, 2022

Multimedia data are semi-structured or unstructured information elements whose content is used, individually or collectively, for communication. Multimedia data mining identifies, categorizes, and retrieves relevant features from a variety of media in order to recognize informative patterns and relationships for knowledge acquisition. Computer Vision (CV)-based systems have become increasingly popular in recent years, owing to the growing number and complexity of datasets. In CV, finding meaningful photos in a huge dataset is a difficult task. Traditional search engines retrieve photos based on text such as captions and metadata, but this strategy can produce a lot of irrelevant output, not to mention the time, effort, and money required to tag this textual data. In this paper, we propose a pipelined deep-learning-oriented methodology framework for multimedia web-data mining that takes content-extracted feature maps in planar projection as input. Color, texture, shape, and other high-level properties of images are represented as numerical feature vectors. The technique builds on general computer vision tasks such as image segmentation, image classification, and object detection.
To demonstrate its computational efficiency and to validate its statistical behaviour, we also present an experimental evaluation on a standard multimedia dataset. The obtained performance results are then compared with some significant existing approaches in terms of various statistical measures/parameters.

Povzetek (summary): A deep-learning method for multimedia mining based on the content features of images is presented. It is used for various computer vision tasks such as segmentation, classification, and object detection. It was tested on a standard multimedia dataset.

1 Introduction

Multimedia information mining involves identifying, categorizing, and retrieving relevant features from a variety of media to identify informative patterns and relationships for knowledge acquisition. The performance of multimedia information mining is directly affected by the level of information contained in the data, which can be recognized during the data pre-processing stage. Data pre-processing aims to reduce data dimensionality while preserving useful features, which are the factors (or attributes) used as input for selection algorithms. This greatly improves runtime (training time), predictive accuracy, and the readability of results for human interpretation.

There are two key steps in data pre-processing: feature extraction (FE) and feature selection (FS). FS techniques are a subset of the broader field of FE. While FE extracts multiple features from the original data to generate a dataset, FS selects a subset of the original variables to build the model. Identifying related features provides insights into the underlying phenomena, while eliminating irrelevant variables enhances the accuracy of data classification.
Multimedia web-information mining includes various applications such as face recognition, pattern recognition, text detection and recognition in images and video frames, biomedicine, emotion recognition, image retrieval and classification, and image annotation, among others.

1.1 Multimedia

Multimedia data are semi-structured or unstructured information elements whose content is used, individually or collectively, for communication. The term multimedia covers a variety of media objects, such as combinations of text, image, video, audio, numeric, animation, graphical, and categorical data presented on a computer display terminal. In software technology, multimedia is defined as a computer program composed of text, graphics, sound, pictures, and animations.

Text consists of a large number of unstructured and amorphous natural-language characters [22, 23]. Text can be embedded in videos in two forms, namely caption text and scene text [24]. Caption text refers to text overlaid during video editing, whereas scene text is text that naturally exists in the scene when the video is captured. To handle embedded text for a proper exchange of information (e.g., the title and date of an event, or the names of the speakers), caption text can be extracted directly from videos and images, since it is simply overlaid on these visual representations.

Sound is a signal resulting from oscillations or pressure variations produced by a moving or vibrating body, or by turbulent fluid flow, in an elastic medium such as air, water, or solids [25]. Uzun and Sencar [26] categorized audio into speech, music, non-speech and non-music signals, and complex audio mixtures arising from several distinct acoustic sources.
Audio content can be automatically analyzed and searched. For instance, speech detection within audio is widely used for identification or recognition, for example to support evidence collection and the diagnosis of Alzheimer's disease.

1.2 Challenges in multimedia data mining

Multimedia data mining poses many challenges. Data with high and complex dimensionality suffer from the curse of dimensionality [27]. In this phenomenon, the amount of data required to support the results grows exponentially with dimensionality, thereby undermining the performance of data mining [28-29]. In other words, the volume of the data space increases, and reliable data become sparser, with every additional dimension in the analysed data. High-dimensional data complicate multimedia analysis, storage, and retrieval. A large feature space easily leads to overfitting: when the data mining mechanism searches for the best parameters using limited data, it captures both the general patterns in the data and the noise specific to that data. In terms of accuracy, efficiency, and reliability, retrieving a desired multimedia item (e.g., an image) from a continually growing database is complicated by the difficulty of obtaining a proper image feature set.

Although it is generally agreed that object image annotation is fundamentally a classification problem, most web images are multi-labelled. An image can therefore reflect several semantic concepts. Multi-label classification also has a different set of evaluation measures from single-label classification. Traditional algorithms mostly transform a multi-label classification problem into several binary classification problems, one for each concept.
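The transformation of a multi-label problem into one binary problem per concept (often called binary relevance) can be sketched as follows. This is only a minimal illustration of the idea, not the implementation used in this work; the image identifiers and concept names are hypothetical.

```python
# Binary relevance: split one multi-label problem into one binary problem per concept.
# The image ids and labels below are hypothetical, for illustration only.

def binary_relevance(samples):
    """Map {item: set_of_labels} to {label: {item: 0 or 1}}."""
    concepts = sorted(set().union(*samples.values()))
    return {
        concept: {item: int(concept in labels) for item, labels in samples.items()}
        for concept in concepts
    }

images = {
    "img_001": {"beach", "sunset"},
    "img_002": {"mountain"},
    "img_003": {"beach", "mountain"},
}

problems = binary_relevance(images)
for concept, targets in problems.items():
    print(concept, targets)
```

Each entry of `problems` is then an ordinary binary task (concept present or absent) that a standard classifier could be trained on independently.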
Two kinds of noise exist in multimedia data. The first kind is background noise. Background noise can occur in video or audio data for several reasons, including background voices recorded through a microphone and a faulty song-reproduction system.

Classification accuracy may be further improved by selecting appropriate discriminative feature groups. Consequently, a good model with a good FS capability is needed to improve both accuracy and efficiency; that is, a high AUC value at the same selected feature dimension indicates satisfactory performance with a small number of features involved. Evaluating features individually means that the importance of each feature is determined in isolation rather than by considering the relationships among attributes.

1.3 Generalized methods for multimedia web-data mining

The general methods of multimedia web-data mining can be divided into two categories: (i) methods that take a text word(s) query as input and (ii) methods that take extracted feature maps in planar projection as input.

1.3.1 Multimedia web-data mining based on input text word(s) query as input

Words or phrases that convey content are known as keywords. They may be used as metadata to represent photos, text documents, database entries, and Web pages. Assigning keywords to a photo makes it possible to retrieve, index, sort, and browse large amounts of image data. Keywords are used on the Web in two distinct ways: (i) as search terms for search engines and (ii) as terms that identify the content of a website. An annotation is metadata attached to text, an image, or other data. It refers to a specific part of the original data or image.
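Keyword-based retrieval of this kind can be pictured as a lookup in an inverted index that maps each keyword to the images annotated with it. The sketch below is a minimal illustration under that assumption; the annotations and function names are hypothetical and do not come from any particular search engine.

```python
from collections import defaultdict

def build_index(annotations):
    """Build an inverted index: keyword -> set of image ids tagged with it."""
    index = defaultdict(set)
    for image_id, keywords in annotations.items():
        for keyword in keywords:
            index[keyword.lower()].add(image_id)
    return index

def search(index, query):
    """Return the ids of images tagged with every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    hits = [index.get(word, set()) for word in words]
    return set.intersection(*hits)

# Hypothetical manual annotations.
annotations = {
    "img_001": ["beach", "sunset"],
    "img_002": ["beach", "horses"],
    "img_003": ["mountains"],
}
index = build_index(annotations)
print(sorted(search(index, "beach")))
print(sorted(search(index, "beach sunset")))
```

The result depends entirely on the keywords an annotator happened to assign: an image of a beach that was tagged only as "sunset" would never be returned for the query "beach".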
Limitations of this type of method:-

– The task of describing image content is highly subjective.
– Along with the relevant results, a query can return a large number of irrelevant results, which means that the precision of the search is low.
– The textual descriptions given by one annotator may differ from those given by another user. For different people, an image represents different things. It may also mean different things to different people at different times.

1.3.2 Multimedia web-data mining based on extracted feature maps in planar projection as input

To overcome the limitations of multimedia web-data mining based on a text word(s) query, mining can instead take the extracted feature maps in planar projection as input. This is the use of computer vision for retrieving the image object. Information retrieval is the process of converting a request for information into a meaningful set of references. It is a technology that in principle organizes digital image archives according to their visual content. Such a system identifies the distinct regions present in an image based on their similarity in colour, texture, shape, and so on, and determines the similarity between two images by computing the closeness of these regions.

1.4 Related work

This section presents state-of-the-art developments in this domain over recent years. Cimino [1] proposed a genetic interval neural network architecture that uses a partitioning of the data points in the computational space. Froelich [2] proposed a time-series modelling technique that uses information granules. Hmouz [3] proposed a time-series prediction model using granular time series.
Zhu [4] proposed a hybrid fuzzy model that combines the Takagi–Sugeno (TS) fuzzy logic model with a data partitioning and allocation technique. Musaylh [5] proposed a model that forecasts short-term electricity demand in Queensland using SVR and ARIMA. Likewise, clustering techniques for data analysis have been the subject of various studies; among them, particular attention has been paid to context-based fuzzy C-means (CFCM) clustering, which builds on fuzzy C-means (FCM) clustering. Emmanuel D [6] described Evolutionary Deep Networks for Efficient Machine Learning. A. Sellami et al. [7] presented a comparative study of dimensionality reduction methods for image analysis and interpretation. M. Kayed et al. [8] categorized the garments of the Fashion MNIST dataset using the CNN LeNet-5 architecture. Pattern analysis of image objects is performed by Imran K. et al. [9]. Rim Rekik et al. [10] proposed a computational process for collecting and extracting data (criteria concerning websites) from a list of studies.

M. Stamenovic et al. [11] presented a visual classifier, useful for inferring a document's overall appearance, and a text classifier, for making content-informed decisions. Reshma P.K. et al. [12] presented a soft computing framework for web mining of multimedia data. MJ Sindhu [13] presented a framework for multimedia retrieval using web mining. Seema S [14] gave an extensive survey of machine learning techniques for data mining. Statistical and probabilistic information is used for metadata creation. In [15], Bayes' theorem is exploited with naive independence assumptions among features. Extending database applications to handle multimedia objects requires synchronization of multiple media data streams [16]. Peter et al.
[17] describe the importance of data pre-processing in web usage mining and the effect of errors made at this stage on the subsequent analysis of the data. It is essential not to underestimate the data pre-processing stage in web usage mining, as it directly influences the quality of the acquired knowledge. In [18], Remote Sensing (RS) image retrieval methods and applications are comprehensively reviewed. The evaluation datasets and metrics of RS image retrieval are summarized. Performance on two kinds of classic RS image retrieval tasks is also discussed by the authors. G. Suseendran et al. [35] presented a deep learning Semi-structured Tree Miner for Data Streams algorithm focused on frequent pattern mining in data streams over semi-structured data. Singh et al. [36][37][38][39] have given robust frameworks and methods for processing multimedia data using a variety of soft-computing-based computational primitives.

Table 1: Research gap / limitations

Approach               Research gap
Froelich et al. [2]    Increased complexity; difficulty in capturing long-term trends; data sparsity.
Zhu et al. [4]         Interpretability; overfitting; scalability; lack of generality.
Musaylh et al. [5]     Limited ability to capture complex patterns; sensitivity to outliers; limited ability to handle seasonality.
Sellami et al. [7]     Loss of information; selection of features to reduce; lack of transparency; suffers from overfitting.
Kayed et al. [8]       Computationally intensive; lack of generalization; data requirements.
Peter et al. [17]      Massive data requirements; computationally intensive; sensitivity to image quality.
Yansheng et al. [18]   More atmospheric interference; limited ground truth data; limited spectral coverage.

Some significant state-of-the-art approaches, along with their limitations in the form of research gaps, are given in Table 1.
1.5 Contribution(s) in this paper

The contributions of this work are as follows:-

– As the research contribution, we propose a pipelined deep-learning-oriented methodology framework for multimedia web-data mining that takes extracted feature maps in planar projection as input.
– To demonstrate its computational efficiency and to validate its statistical behaviour, we also present an experimental evaluation on a standard multimedia dataset. The obtained performance results are compared with other significant existing approaches in terms of various statistical measures/parameters.

1.6 Organization of the paper

The remainder of this paper is structured as follows. Section 2 describes the proposed method in detail, giving an overview of the key components and techniques used and highlighting the distinctive aspects of the method. Section 3 discusses the experimental evaluation and the results obtained. Finally, Section 4 concludes the paper with a summary of the research, its main contributions, and the future scope of this work.

2 Proposed method

This section presents our proposed framework, with an overview of the architecture and methodology followed by the detailed algorithmic flow.
Algorithm 1 Feature Extraction
1: Five methods are used in this phase:
   – Color ▶ performs color feature extraction
   – Daisy ▶ extracts DAISY features such as intensity
   – Edge ▶ performs edge feature extraction
   – Histogram of gradient ▶ extracts the gradient orientation in the image
   – Scale-invariant feature transform (SIFT) ▶ the gradient at every pixel is treated as a sample of a three-dimensional elementary feature vector formed by the pixel location and the gradient orientation
2: Individually, these methods do not give adequate results, because each one deals with only one particular characteristic such as color, intensity, or edges. All of these methods are therefore combined in the form of an ensemble.

Algorithm 2 Inconsistency Measurement
1: $\forall Z = \{V_1, V_2, \cdots, V_n\}$ in the approximation space over the universe $U$, for a particular concept $C$, compute:
2:   $\underline{Z}C = \{x \in U \mid [x] \subseteq C\}$ (lower approximation)
3:   $\overline{Z}C = \{x \in U \mid [x] \cap C \neq \emptyset\}$ (upper approximation)
     where $[x]$ is the equivalence class of $x$
4: Compute $R_I = \overline{Z}C - \underline{Z}C$
5: IF ($R_I$ == NULL) { No inconsistent region }

Algorithm 3 Optimal Variable / Feature Selection
1: BEGIN
2: Fun($A$, $L$, $I$, $\delta$)
3:   $A$: set of attributes / features / variables
4:   $L$: label column
5:   $I$: set of instances
6:   $\delta$: threshold
7:   fun_cal(): function to calculate core features
8: $R \leftarrow$ fun_cal()
9: while $\gamma_R(L) < \delta$
10:   $I \leftarrow I - POS_R(L)$
11:   $\forall a \in (A - R)$
12:     $p_a = |POS_{R \cup \{a\}}(L)|$
13:     $q_a \leftarrow |EquivClass_{MAXLEN}(POS_{R \cup \{a\}}(L))|$
14:   Choose the $a$ with the largest value of $p_a \times q_a$
15:   $R \leftarrow R \cup \{a\}$
16: Return $R$
17: END

2.1 Architecture and methodology overview

Computer vision techniques and deep learning algorithms are used together for the feature extraction process. Each image is represented as a feature vector. Similarity measures such as cosine similarity and Euclidean distance are used to measure the closeness between the feature vector of the query image and those of the images available in the dataset. A soft-computing-based statistical algorithmic module is adopted for optimized variable selection.
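The approximations of Algorithm 2 and the dependency degree $\gamma_R(L)$ tested in the loop of Algorithm 3 can be illustrated on a toy decision table. The sketch below assumes the standard rough-set definitions (equivalence classes of objects that agree on the chosen attributes, and a positive region formed by the lower approximations of all label classes); the table contents and the function names are hypothetical and are not taken from the authors' implementation.

```python
from collections import defaultdict

# Toy decision table (hypothetical): object id -> attribute values and class label.
TABLE = {
    1: {"color": "red", "texture": "smooth", "label": "flower"},
    2: {"color": "red", "texture": "smooth", "label": "flower"},
    3: {"color": "red", "texture": "rough", "label": "flower"},
    4: {"color": "green", "texture": "rough", "label": "tree"},
    5: {"color": "green", "texture": "rough", "label": "flower"},
    6: {"color": "blue", "texture": "smooth", "label": "tree"},
}

def equivalence_classes(table, attrs):
    """Partition the universe into blocks of objects that agree on every attribute in attrs."""
    blocks = defaultdict(set)
    for obj, row in table.items():
        blocks[tuple(row[a] for a in attrs)].add(obj)
    return list(blocks.values())

def approximations(table, attrs, concept):
    """Return the (lower, upper) approximations of concept with respect to attrs."""
    lower, upper = set(), set()
    for block in equivalence_classes(table, attrs):
        if block <= concept:      # [x] is a subset of C
            lower |= block
        if block & concept:       # [x] intersects C
            upper |= block
    return lower, upper

def dependency_degree(table, attrs, label):
    """gamma: fraction of objects lying in the positive region POS_attrs(label)."""
    concepts = defaultdict(set)
    for obj, row in table.items():
        concepts[row[label]].add(obj)
    positive = set()
    for concept in concepts.values():
        positive |= approximations(table, attrs, concept)[0]
    return len(positive) / len(table)

flowers = {obj for obj, row in TABLE.items() if row["label"] == "flower"}
lower, upper = approximations(TABLE, ["color", "texture"], flowers)
boundary = upper - lower          # the inconsistent region R_I of Algorithm 2
gamma = dependency_degree(TABLE, ["color", "texture"], "label")
print(sorted(lower), sorted(upper), sorted(boundary), round(gamma, 3))
```

In this toy table, objects 4 and 5 have identical attribute values but different labels, so they form the inconsistent region, and only four of the six objects can be classified with certainty.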
2.2 Detailed algorithmic flow

Our approach replaces the typical method of storing photos in databases, which involves labelling them according to their content and retrieving them using keywords. In the proposed methodology, multimedia web-data mining is performed on extracted feature maps in planar projection rather than on input metadata such as keywords, tags, or descriptions associated with the image. The methodology comprises four main algorithmic modules and is primarily based on equivalence class evaluation pipelined to feature maps onto planar projection. Algorithmic block 1 performs feature extraction. Algorithmic block 2 performs inconsistency measurement. Computationally optimal variable selection is carried out in Algorithm 3. Algorithmic block 4 performs the transfer learning procedure.

Algorithm 4 Transfer Learning Proc()
1: VGGNET + RESNET architecture exploitation:
   – Input → fixed 224 × 224 pixel image
   – Subtract the mean RGB value, computed on the training set, from each pixel
   – Use 3 × 3 filters in the convolution layers
   – Fix the convolution stride to 1 pixel; padding is 1 pixel for 3 × 3 conv. layers
   – Exploit Lr_MAXPOOLING(1) ··· Lr_MAXPOOLING(5)
   – Max-pooling ↬ { pixel window = 2 × 2, stride = 2 }
   – Process in fully-connected layers: 4096 channels
   – Process in soft-max layer

Our proposed methodology framework has some computational advantages over other methods available in the literature. These are discussed in the points below:-

– The proposed algorithmic module for dimension-reduction-based feature selection reduces the number of features or variables in a dataset, which makes the dataset more manageable and reduces the computational complexity of subsequent analyses or modelling tasks. This can result in faster computation and reduced resource requirements.
– The proposed module for transfer-learning-based classification can learn complex non-linear relationships in data without requiring explicit feature engineering, which can save time and effort in the data mining process.
– Our proposed methodology can perform well in tasks with imbalanced datasets, where the number of examples in each class is different, by adjusting the decision threshold. This can improve the model's performance on the minority class.

3 Experimental analysis

This section presents the experimental evaluation: the simulation set-up, a summary of the dataset, the computational libraries and packages used, a summary of the results obtained, and comparisons of our results with some significant existing approaches.

3.1 Simulation set-up

In this study, the simulation environment was set up as follows: the operating system was Ubuntu 18.04 LTS, and the hardware included 8 GB of RAM and an Intel Core i7 4032U CPU running at 3.2 GHz.

3.2 Dataset overview

For the experimental analysis, the Corel 10K dataset [34] is used. It contains 10,000 images from various categories such as sunset, beach, flower, building, car, horses, mountains, fish, food, and door, among others. Each class includes 100 JPEG images with a resolution of 192 × 128 or 128 × 192. Ten classes were chosen for the simulation from the dataset's total of 100 classes.

3.3 Exploited computational libraries and packages

Python 3.7.4 is used for the implementation. The computational libraries and packages used are NumPy, SciPy, pandas, Matplotlib, Statsmodels and PyTorch. PyTorch is an essential requirement for running the ResNet and VGGNet modules.

3.4 Results summary

The generated confusion matrix is a 10 × 10 matrix, as the total number of classes considered here is 10.
Table 2 presents the statistical performance measures for each individual category in terms of precision, recall and accuracy. Numerous test cases were carried out for the experimental evaluation; the input query instances and the corresponding output result instances are shown in Figures 3 to 12.

3.4.1 Confusion matrix

Figure 1: Confusion matrix

3.4.2 Statistical performance measures

Table 2: Statistical performance measures

Label / Category   Precision   Recall   Accuracy
0                  0.6         1        0.9
1                  1           1        0.89
2                  1           1        0.9
3                  1           1        0.93
4                  1           1        0.98
5                  0.6         0.6      0.71
6                  1           0.6      0.89
7                  1           0.8      0.94
8                  1           0.2      0.78
9                  1           1        0.98

The statistical performance measures are depicted in Figure 2.

Figure 2: Depiction for Statistical Performance Measures
Figure 3: i/p query instance (buildings)
Figure 4: o/p result instances (buildings)
Figure 5: i/p query instance (flowers)
Figure 6: o/p result instances (flowers)
Figure 7: i/p query instance (trees)
Figure 8: o/p result instances (trees)
Figure 9: i/p query instance (mountains)
Figure 10: o/p result instances (mountains)
Figure 11: i/p query instance (vegetables)
Figure 12: o/p result instances (vegetables)

3.5 Comparisons on performance metrics

To assess the proposed framework, comparisons are performed with some significant existing approaches, and the results are provided in Table 3. From this table, it can be observed that the proposed framework outperforms the other existing approaches in both precision and recall.
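The per-class values in Table 2 can be related to the aggregate precision and recall reported for the proposed framework in Table 3. The check below assumes that the aggregate is the unweighted (macro) mean over the ten classes; the paper does not state this explicitly, so it should be read as a consistency check rather than as the authors' stated procedure.

```python
# Per-class precision and recall copied from Table 2 (classes 0..9).
precision = [0.6, 1, 1, 1, 1, 0.6, 1, 1, 1, 1]
recall = [1, 1, 1, 1, 1, 0.6, 0.6, 0.8, 0.2, 1]

def macro_average(values):
    """Unweighted mean over classes; every class counts equally."""
    return sum(values) / len(values)

macro_precision = macro_average(precision)
macro_recall = macro_average(recall)
print(f"macro precision = {macro_precision:.2f}")  # 0.92
print(f"macro recall    = {macro_recall:.2f}")     # 0.82
```

Under that assumption, the means agree with the values 92.0 and 82.0 listed for the proposed framework in Table 3 when those are read as percentages.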
Table 3: Comparative analysis

Method               Precision values (%)   Recall values (%)
MTH [30]             40.87                  49.1
CPV-THF [31]         52.28                  62.7
STH [32]             48.03                  57.6
CMSD [33]            50.25                  60.3
Proposed framework   92.0                   82.0

4 Conclusion and future scope

The performance of multimedia information mining is directly influenced by the level of information contained in the data, which can be recognized during the data pre-processing stage. This paper presents a pipelined, deep-learning-oriented methodology framework for multimedia web-data mining that takes extracted feature maps in planar projection as input. An experimental evaluation is conducted on a standard multimedia dataset [34]. The obtained performance results are compared with some significant existing approaches in terms of various statistical measures and parameters.

4.1 Future scope

The future scope of this research work is to analyze and identify interrelationships within multimedia data sets, as well as to derive a composite score from several different sub-scores.

References

[1] Cimino, M.G.C.A.; Lazzerini, B.; Marcelloni, F.; Pedrycz, W. Genetic interval neural networks for granular data regression. Inf. Sci. 2014, 257, 313–330. doi: https://doi.org/10.1016/j.ins.2012.12.049.
[2] Froelich, W.; Pedrycz, W. Fuzzy cognitive maps in the modeling of granular time series. Knowl.-Based Syst. 2017, 115, 110–122. doi: https://doi.org/10.1016/j.knosys.2016.10.017.
[3] Hmouz, R.A.; Pedrycz, W.; Balamash, A. Description and prediction of time series: A general framework of granular computing. Expert Syst. Appl. 2015, 42, 4830–4839. doi: https://doi.org/10.1016/j.eswa.2015.01.060.
[4] Zhu, X.; Pedrycz, W.; Li, Z. A design of granular Takagi-Sugeno fuzzy model through the synergy of fuzzy subspace clustering and optimal allocation of information granularity. IEEE Trans. Fuzzy Syst. 2018, 26, 2499–2509. doi: https://doi.org/10.1109/tfuzz.2018.2813314.
[5] Musaylh, M.S.A.; Deo, R.C.; Adamowski, J.F.; Li, Y. Short-term electricity demand forecasting with MARS, SVR and ARIMA models using aggregated demand data in Queensland, Australia. Adv. Eng. Inform. 2018, 35, 1–16. doi: https://doi.org/10.1016/j.aei.2017.11.002.
[6] Emmanuel D, B.A. Bassett, EDEN: Evolutionary Deep Networks for Efficient Machine Learning, arXiv preprint arXiv:1709.09161, 2017.
[7] A. Sellami and M. Farah, "Comparative study of dimensionality reduction methods for remote sensing images interpretation," 2018 4th International Conference on Advanced Technologies for Signal and Image Processing (ATSIP), Sousse, 2018, pp. 1-6. doi: https://doi.org/10.1109/ATSIP.2018.8364490.
[8] M. Kayed, A. Anter and H. Mohamed, "Classification of Garments from Fashion MNIST Dataset Using CNN LeNet-5 Architecture," 2020 International Conference on Innovative Trends in Communication and Computer Engineering (ITCE), Aswan, Egypt, 2020, pp. 238-243. doi: https://doi.org/10.1109/itce48509.2020.9047776.
[9] Imran Khan, Asif Khan, Riaz Ahmed Shaikh, Object analysis in image mining, 2015 2nd International Conference on Computing for Sustainable Global Development (INDIACom), 11-13 March 2015, IEEE.
[10] Rim Rekik, Ilhem Kallel, Jorge Casillas, Adel M. Alimi, Assessing web sites quality: A systematic literature review by text and association rules mining, International Journal of Information Management, Volume 38, Issue 1, February 2018, Pages 201-216. doi: https://doi.org/10.1016/j.ijinfomgt.2017.06.007.
[11] Marko Stamenovic, Sam Schick, Jiebo Luo, Machine Identification of High Impact Research through Text and Image Analysis, 2017 IEEE Third International Conference on Multimedia Big Data, 19-21 April 2017, IEEE. doi: https://doi.org/10.1109/BigMM.2017.63.
[12] Reshma P.K., Lajish V.L., Web Mining for Multimedia Data-A Soft Computing Framework, International Journal of Scientific & Engineering Research, Volume 5, Issue 9, September-2014.
[13] Manda Jaya Sindhu, Y. Madhavi Latha, V. Samson Deva Kumar, Suresh Angadi, Multimedia Retrieval Using Web Mining, International Journal of Recent Technology and Engineering (IJRTE) ISSN: 2277-3878, Volume-2, Issue-1, March 2013.
[14] Seema Sharma, Jitendra Agrawal, Shikha Agarwal, Sanjeev Sharma, Machine Learning Techniques for Data Mining: A Survey, 2013 IEEE International Conference on Computational Intelligence and Computing Research. doi: https://doi.org/10.1109/iccic.2013.6724149.
[15] Luis Enrique Sucar, Bayesian Classifiers, Probabilistic Graphical Models, Advances in Computer Vision and Pattern Recognition, pp. 41-62, (2015). doi: https://doi.org/10.1007/978-1-4471-6699-3_4.
[16] P. K. Yadav and S. Rizvi, "An exhaustive study on data mining techniques in mining of Multimedia database," 2014 International Conference on Issues and Challenges in Intelligent Computing Techniques (ICICT), Ghaziabad, 2014, pp. 541-545. doi: https://doi.org/10.1109/ICICICT.2014.6781339.
[17] Peter Svec, Lubomir Benko, Miroslav Kadlecik, Jan Kratochvil, Michal Munk, Web Usage Mining: Data Pre-processing Impact on Found Knowledge in Predictive Modelling, Procedia Computer Science, Volume 171, 2020, Pages 168-178. doi: https://doi.org/10.1016/j.procs.2020.04.018.
[18] Yansheng Li, Jiayi Ma, Yongjun Zhang, Image retrieval from remote sensing big data: A survey, Information Fusion, Volume 67, 2021, Pages 94-115. doi: https://doi.org/10.1016/j.inffus.2020.10.008.
[19] Z. Pawlak, Information systems theoretical foundations, Information Systems, Vol. 6 (3): pp. 205-218 (1981). doi: https://doi.org/10.1016/0306-4379(81)90023-5.
[20] Cunningham P., Dimension reduction. Technical Report: UCD-CSI-2007-7, (2007).
[21] Yan, J., Zhang, B., Liu, N., Yan, S., Cheng, Q., Fan, W., Yang, Q., Xi, W., Chen, Z. Effective and efficient dimensionality reduction for large-scale and streaming data preprocessing, IEEE Transactions on Knowledge and Data Engineering, Vol. 18, No. 3, 320-333, (2006). doi: https://doi.org/10.1109/tkde.2006.45.
[22] J.-H. Seok, J.H. Kim, Scene text recognition using a Hough forest implicit shape model and semi-Markov conditional random fields, Pattern Recogn. 48 (2015) 3584-3599. doi: https://doi.org/10.1016/j.patcog.2015.05.004.
[23] I.H. Witten, E. Frank, Data mining: Practical machine learning tools and techniques, Morgan Kaufmann, 2005.
[24] P. Shivakumara, T.Q. Phan, C.L. Tan, A robust wavelet transform based technique for video text detection, 2009, 1285–1289. doi: https://doi.org/10.1109/icdar.2009.83.
[25] C.H. Hansen, Fundamentals of acoustics, in: B.H. Goelzer, C.H. Hansen, G.A. Sehrndt (Eds.), Occupational Exposure to Noise: Evaluation, Prevention and Control, World Health Organization, Geneva, 2001.
[26] E. Uzun, H.T. Sencar, A preliminary examination technique for audio evidence to distinguish speech from non-speech using objective speech quality measures, Speech Comm. 61-62 (2014) 1–16. doi: https://doi.org/10.1016/j.specom.2014.03.003.
[27] R.E. Bellman, Adaptive control processes: a guided tour, Princeton University Press, Princeton, 1961. doi: https://doi.org/10.1002/nav.3800080314.
[28] M. Verleysen, D. François, The curse of dimensionality in data mining and time series prediction, Computational Intelligence and Bioinspired Systems, Springer 2005, pp. 758–770. doi: https://doi.org/10.1007/11494669_93.
[29] C.M. Bishop, Pattern recognition and machine learning, Springer, 2006.
[30] G.-H. Liu, L. Zhang, Y.-K. Hou, Z.-Y. Li, and J.-Y.
Yang, "Image retrieval based on multi-texton histogram," Pattern Recognition, vol. 43, no. 7, pp. 2380–2389, 2010. doi: https://doi.org/10.1016/j.patcog.2010.02.012.
[31] A. Raza, H. Dawood, H. Dawood, S. Shabbir, R. Mehboob, and A. Banjar, "Correlated primary visual texton histogram features for content base image retrieval," IEEE Access, vol. 6, pp. 46595–46616, 2018. doi: https://doi.org/10.1109/access.2018.2866091.
[32] A. Raza, T. Nawaz, H. Dawood, and H. Dawood, "Square texton histogram features for image retrieval," Multimedia Tools and Applications, vol. 78, no. 3, pp. 2719-2746, 2019. doi: https://doi.org/10.1007/s11042-018-5795-x.
[33] H. Dawood, M. H. Alkinani, A. Raza, H. Dawood, R. Mehboob, and S. Shabbir, "Correlated microstructure descriptor for image retrieval," IEEE Access, vol. 7, pp. 55206–55228, 2019. doi: https://doi.org/10.1109/access.2019.2911954.
[34] https://sites.google.com/site/dctresearch/Home/
[35] G. Suseendran, D. Balaganesh, D. Akila and S. Pal, "Deep learning frequent pattern mining on static semi structured data streams for improving fast speed and complex data streams," 2021 7th International Conference on Optimization and Applications (ICOA), 2021, pp. 1-8. doi: https://doi.org/10.1109/ICOA51614.2021.9442621.
[36] Arappal, N., Singh, A., Saidulu, D. (2022, September). A Soft Computing Based Approach for Pixel Labelling on 2D Images Using Fine Tuned R-CNN. In International Conference on Innovations in Computer Science and Engineering (pp. 415-424). Singapore: Springer Nature Singapore. doi: https://doi.org/10.1007/978-981-19-7455-7_61.
[37] Singh, A., Tiwari, V., Tentu, A. N. (2022, March). Ceiling improvement on breast cancer prediction accuracy using unary KNN and binary LightGBM stacked ensemble learning. In Proceedings of the Seventh International Conference on Mathematics and Computing: ICMC 2021 (pp. 451-471). Singapore: Springer Singapore. doi: https://doi.org/10.1007/978-981-16-6890-6_34.
[38] Singh, A., Tiwari, V. (2019). An optimal dimension reduction-based feature selection and classification strategy for geospatial imagery. International Journal of Knowledge Engineering and Soft Data Paradigms, 6(2), 120-138. doi: https://doi.org/10.1504/ijkesdp.2019.102851.
[39] Singh, A., Tiwari, V., Garg, P., Tentu, A. N. (2019). Reasoning for Uncertainty and Rough Set-Based Approach for an Efficient Biometric Identification: An Application Scenario. In Computational Intelligence: Theories, Applications and Future Directions-Volume II: ICCI-2017 (pp. 465-476). Springer Singapore. doi: https://doi.org/10.1007/978-981-13-1135-2_35.