https://doi.org/10.31449/inf.v48i5.5345
Informatica 48 (2024) 29–40

Research on Automatic Recognition Technology of Library Books Based on Image Processing

Haiyan Xun
Library of Shandong Women's University, Jinan, Shandong, 250300, China
E-mail: xunhaiyan2023@126.com

Keywords: image processing, automatic recognition technology, image retrieval

Received:

Making computers more intelligent is the direction of future development. The amount of information in today's society keeps increasing, which places higher demands on retrieval technology and on the level of automation. With the development and popularization of the Internet, online shopping, study, and even work have become the norm in people's lives. However, a large amount of data is generated on the Internet every day, and how to obtain the information people need from this data is a key research problem. Traditional global search or image search can no longer keep pace with the amount of information people need, so content-based search will inevitably become a more widely used database retrieval technology. In recent years, content-based image retrieval (CBIR) has become a research focus in the field of image information retrieval. This article first studies image processing and retrieval technology. It gives a detailed overview of image segmentation technology, the general image segmentation model, and three image segmentation methods. It also introduces the basic principles and framework of a CBIR system. The most important technology in a CBIR system is the analysis and description of image features, so several methods commonly used to express image feature content are described. Secondly, the article analyzes image edge detection algorithms in detail, and finally it introduces the functional division and workflow of the automatic library book recognition system.
It also describes the construction environment, processes, and algorithms used to implement the basic functions of the system.

Povzetek: The paper presents image processing and image capture technologies, with a detailed treatment of image segmentation and segmentation methods.

1 Introduction

In modern scientific research and daily life, people usually prefer to obtain information through images. However, traditional manual identification can no longer meet the needs of social development, which motivates research on intelligent automatic identification technology for library books. With the development of computer technologies and networks, informatization has become the mainstream direction of development for organizations of all kinds. In this information age, people usually learn about social changes through information on websites. The library is the dissemination center for all kinds of information in colleges and universities [1-2]. With the popularization of information technology, major libraries have also established online libraries. Such online libraries not only manage the books in the library effectively, but also make it easier for people to read them. The online library also shortens the time people need to find a book. However, because the library holds a large number of books, the order of books in the online library system is entered according to the order of the books in the physical library. If a book is shelved in the wrong place, the information about that book in the system may therefore be wrong. This has a very bad influence on the development of online libraries [3].

Image segmentation usually refers to the segmentation of digital images. It is the process of dividing an image into several non-overlapping regions.
The basic idea is to classify and group pixels with similar feature values in a digital image, and to divide the image into regions of different importance, thereby reducing the amount of data that later stages have to handle for each part of the image. The resulting structural information is suitable for later stages of image processing: it retains information about the target structure while significantly reducing the amount of target data [4-5].

Image segmentation techniques can be divided into three categories. The first is based on extracted regional features or on morphological results derived from them; it uses thresholds, region-based methods, and textures to classify groups of image pixels. The second focuses on the borders in the picture and obtains regions by delimiting the edges found by an edge operator. The third uses statistical features and prior knowledge for image segmentation; this method segments first, then highlights the objects, and then combines them into segmented regions according to domain knowledge [6].

Corresponding to these three categories, the most commonly used segmentation techniques are the threshold method, the edge detection method, and the region growing method. In threshold segmentation, the maximum threshold and the fuzzy threshold are more suitable for segmenting specific types of images (vessels) [7-8]. When the boundary that separates the background from the target is not obvious, however, the segmentation effect is poor. The edge detection method first detects the pixels along the edges in the image and then links these edge pixels together to form a segmented area. It uses local window operations to find the edge information in the image, and it is suitable for images with a solid-color background [9-10]. Related work is summarized in Table 1.
Table 1: Related works

Ref. No: [11]
Methodologies: To facilitate effective deep-learning medical image processing, the study presents TorchIO, a Python package. Patch-based sampling, spatial metadata consideration, and data augmentation are highlighted as ways to overcome the difficulties associated with processing MRI and CT images.
Summary of findings: Patch-based sampling, augmentation, preprocessing, and loading are made easier by TorchIO. It supports composition, transform inversion, and simulation of MRI-specific artifacts. It interfaces with PyTorch and conventional medical image processing libraries. Open-science principles are encouraged by the availability of the source code, tutorials, and documentation.
Limitation: The efficiency of TorchIO, which simplifies medical image processing, could differ depending on the application case. The modularity and interoperability of the library with various deep-learning frameworks for medical pictures should be taken into consideration by users.

Ref. No: [12]
Methodologies: To extract gear tooth profiles properly without depending on conventional meshing theory, the research presents a novel method called Engagement-Pixel Image Edge Tracking (EPIET). The process entails extracting meshing points, calibrating tool locus coordinates, and obtaining immediate contact images.
Summary of findings: The method's efficiency in removing tooth profile edges is illustrated by a case study on a face gear. Its promise for computerized design of complicated conjugate curved surfaces is demonstrated by the results, which show feasible accuracy and stability when compared to standard meshing equations.
Limitation: The paper discusses major error sources associated with the presented method. While it shows promise for gear profile extraction, further research is needed to explore its applicability to diverse gear types and potential limitations in complex scenarios.

Ref. No: [13]
Methodologies: Using a case study methodology, this paper examines how public libraries, namely the Seattle Public Library (SPL) system, curate and use available demographic data. As part of the study, use cases are created, a dashboard tool prototype is developed using available census data, and information needs are identified through interviews with SPL regional managers.
Summary of findings: The study provides new perspectives on how open data might be used to help SPL regional managers find the information they need. The needs within two SPL regions are successfully addressed via a dashboard tool prototype. As a result of the findings, public libraries may find it easier to keep up with changing neighborhood demographics by developing reproducible data analytic techniques.
Limitation: Limitations include the case study's specialization to the SPL system, which may restrict generalizability even though the paper provides insightful information. Furthermore, issues concerning the timeliness and accuracy of publicly available demographic data may affect how broadly applicable these approaches are in various public library environments.

Ref. No: [14]
Methodologies: The goal of the project is to increase the efficiency of library book borrowing by developing an automated book sorter that makes use of RFID technology and a single-chip microprocessor control system. The sorter uses an AC motor to power a conveyor belt, recognizes electronic tags using RFID and a microcontroller, and precisely places books in recycling bins.
Summary of findings: The device's great precision and quick reaction time when processing book numbers and operating equipment are demonstrated through practical operation. Effective book classification is made possible by the combination of RFID technology and a single-chip microcomputer, which improves library operations and improves student learning outcomes.
Limitation: Even though the automated book sorter works well, there could be issues with managing different book sizes or fixing system errors. It might take constant improvement and adjustment to maximize performance in different library circumstances.

Ref. No: [15]
Methodologies: Using Particle Image Velocimetry (PIV) to assess multiphase flows, the study highlights the importance of appropriate image segmentation for precise phase dynamics separation. Triangular meshing and particle detection are used in the suggested approach to provide reliable phase separation and interface detection.
Summary of findings: The new technique uses a 2D unstructured mesh for phase separation and interface detection and successfully identifies tracer particles by utilizing seeding density differentiation between phases. The effectiveness of the method is demonstrated by parametric analysis on synthetic images, and successful tracking of complicated interface evolution is revealed by experimental application in immiscible multiphase flow in porous media.
Limitation: While there is promise in the method, more validation under various experimental conditions is required to verify its wider application, as potential constraints may develop in other flow scenarios.

1.1 Problem statement

There are problems with misplacing books since the current library book management systems are not effective in identifying books. By using image processing technologies, this research proposes an automatic identification technique to address the issue. Accurate character recognition, post-processing analysis, and bookcase detection are the system's main goals. Improvements in orderliness, real-time book tracking, and user-friendly reading environments are all part of the plan to improve library management.
2 Research on image segmentation technology and retrieval technology

2.1 Image segmentation technology

General model of image segmentation

By definition, image segmentation is the process of dividing an image into several regions. Each segmented region is a set of pixels with common numerical features. For example, different objects occupy different areas of the image, and the target and the background occupy different areas of the image. The pixels of a region must be sufficiently homogeneous and connected to be grouped together [8]. Let $F$ be the set of all pixels of the image $G$, let $S_1, \dots, S_n$ be the segmented regions, and let $P(\cdot)$ be the uniformity predicate applied to a region [3]. The segmentation is then described mathematically by formulas (1), (2), (3), and (4):

$$\bigcup_{j=1}^{n} S_j = F \tag{1}$$

$$S_i \cap S_j = \varnothing, \quad (i \neq j) \tag{2}$$

$$P(S_j) = \text{TRUE}, \quad (\forall j) \tag{3}$$

$$P(S_i \cup S_j) = \text{FALSE}, \quad (i \neq j) \tag{4}$$

Summary of image segmentation methods

(1) Threshold segmentation

Threshold segmentation can be divided into three methods: the global threshold method, the multi-threshold method, and the adaptive threshold method. The global threshold method is easiest to apply to complete images with obvious contrast between the target and the background; it works especially well when the gray level of the background is fixed.

The multi-threshold method is suitable for images with several target types and background areas, where different thresholds extract different targets. The core idea is to set multiple thresholds so that the gray levels of each target and background area in the image can be compared against them, which allows pixel gray values to be classified more accurately and the image to be segmented more accurately.
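As a rough illustration of the global and multi-threshold methods described above, the following sketch labels pixels by comparing their gray values with one or several thresholds. It is a minimal sketch, not the paper's implementation: the function names `global_threshold` and `multi_threshold`, the threshold values, and the tiny sample image are all assumptions introduced here.

```python
from bisect import bisect_left

def global_threshold(image, t):
    """Label a pixel 1 (target) if its gray value exceeds t, else 0 (background)."""
    return [[1 if px > t else 0 for px in row] for row in image]

def multi_threshold(image, thresholds):
    """Label each pixel with the index of the gray-level band it falls into.

    With k thresholds there are k + 1 bands, labelled 0..k. A pixel moves to
    the next band only when its value is strictly greater than a threshold,
    which matches the `px > t` rule used by global_threshold.
    """
    ts = sorted(thresholds)
    return [[bisect_left(ts, px) for px in row] for row in image]

# Hypothetical 3x4 grayscale image: dark background, a mid-gray object,
# and a bright object.
image = [
    [10, 12, 200, 210],
    [11, 120, 125, 205],
    [9, 118, 122, 13],
]

binary = global_threshold(image, 100)   # one threshold: target vs background
bands = multi_threshold(image, [50, 160])  # two thresholds: three gray bands
```

With a single threshold the mid-gray and bright objects are merged into one target class, while the two-threshold version separates them, which is the situation the multi-threshold method is meant for.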
The adaptive threshold method adjusts the threshold as the gray level of the background changes, because the separation between the background and the target changes with it. If the threshold is fixed, the target is extracted poorly from this kind of image. With the adaptive threshold method, the gray threshold differs under different image conditions, so it may give a better segmentation effect [9].

The essence of image segmentation is to split a complete image into multiple sub-images based on the feature attributes of each pixel. Pixels that are close together and have similar feature attributes are grouped into a sub-image. These pixels then serve as centers from which the search expands outward for similar pixels; any similar pixels found are merged into the sub-image, and so on until the whole image has been split.

The global threshold method is suitable for complete images with obvious contrast between the target and the background, but it has disadvantages when the contrast between the target and the background is not obvious or the image is incomplete. In an incomplete image, some pixel attributes have been changed or lost, so its histogram is generally close to unimodal, and the value derived by traditional thresholding generally does not fall at the bottom of a valley in the histogram. Traditional threshold segmentation therefore cannot find the correct threshold for such images and cannot segment them reasonably. In threshold segmentation, the target and the background are separated by comparing the gray value of each pixel with the specified threshold.
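The grouping idea described above (collect nearby pixels with similar attributes, then expand outward from them until the whole image is split) can be sketched as a simple region-growing pass. This is a hedged illustration only: the function name `region_growing`, the use of 4-connectivity, the gray-value tolerance `tol` measured against the seed pixel, and the sample image are assumptions made for this sketch, not details taken from the paper.

```python
from collections import deque

def region_growing(image, tol):
    """Split an image into regions of similar gray value.

    Each still-unlabelled pixel (in scan order) becomes a seed. The region
    grows by absorbing 4-connected, unlabelled neighbours whose gray value
    differs from the seed's value by at most `tol`. Every pixel ends up in
    exactly one region, so the regions cover the image and do not overlap.
    """
    rows, cols = len(image), len(image[0])
    labels = [[-1] * cols for _ in range(rows)]
    next_label = 0
    for sr in range(rows):
        for sc in range(cols):
            if labels[sr][sc] != -1:
                continue  # already part of an earlier region
            seed_val = image[sr][sc]
            labels[sr][sc] = next_label
            queue = deque([(sr, sc)])
            while queue:
                r, c = queue.popleft()
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and labels[nr][nc] == -1
                            and abs(image[nr][nc] - seed_val) <= tol):
                        labels[nr][nc] = next_label
                        queue.append((nr, nc))
            next_label += 1
    return labels

# Hypothetical 3x4 grayscale image used only for illustration.
image = [
    [10, 12, 200, 210],
    [11, 120, 125, 205],
    [9, 118, 122, 13],
]
labels = region_growing(image, tol=15)
```

Comparing every candidate with the seed value, rather than with its immediate neighbour, keeps a region from drifting across a slow gray-level gradient; other similarity criteria are equally possible.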
The classic method of selecting a threshold is the maximum between-class variance method (Otsu's method), which selects a single threshold. The principle is to choose the threshold that maximizes the variance between the two classes of pixels it separates.

(2) Edge detection

The edge is one of the important components of an image, so edge detection is an essential step in image segmentation. The edges of an image generally lie between the target and the background. If the edge information can be obtained during segmentation, the efficiency of image segmentation improves [10]. The gradient operators commonly used in edge detection algorithms include the Roberts operator, the Sobel operator, the Prewitt operator, and the Laplace operator.

The Roberts edge detection operator uses a pair of 2×2 templates that take the difference between diagonally adjacent pixels, one along each diagonal direction, and both templates are applied at the same position. The Roberts operator locates image edges fairly accurately, but it is sensitive to noise, so it is not suitable for segmenting images with blurred edges or heavy noise [11]. Because some noise is unavoidable in the image segmentation process, the edges detected by this operator always contain a certain error.
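To make the Roberts operator concrete, the sketch below computes a gradient magnitude from differences between adjacent pixels. It assumes the standard Roberts cross formulation, in which both 2×2 templates take differences along the two diagonals; the function name `roberts_cross` and the sample images are hypothetical and serve only as illustration.

```python
import math

def roberts_cross(image):
    """Gradient magnitude from the two 2x2 Roberts cross templates.

    gx = I[r][c]     - I[r+1][c+1]   (difference along the main diagonal)
    gy = I[r][c+1]   - I[r+1][c]     (difference along the anti-diagonal)
    The output is one row and one column smaller than the input.
    """
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(rows - 1):
        out_row = []
        for c in range(cols - 1):
            gx = image[r][c] - image[r + 1][c + 1]
            gy = image[r][c + 1] - image[r + 1][c]
            out_row.append(math.hypot(gx, gy))
        out.append(out_row)
    return out

# Hypothetical image with a vertical step edge between columns 1 and 2.
step = [
    [0, 0, 100, 100],
    [0, 0, 100, 100],
    [0, 0, 100, 100],
]
edges = roberts_cross(step)

# A flat image with a single noisy pixel: the operator responds to the
# noise as strongly as it would to a real edge of the same height.
noisy = [
    [0, 0, 0],
    [0, 100, 0],
    [0, 0, 0],
]
noise_response = roberts_cross(noisy)
```

The second example shows why the operator is described as noise-sensitive: the templates cover only four pixels and do no averaging, so one outlier produces a full-strength response in every window that touches it.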
The biggest advantage of the Sobel operator is that it suppresses noise to a certain degree while performing edge detection. This smooths the segmentation, makes the segmented edges clearer, and effectively removes false edges.

Sobel operator principle: In image processing, the difference between adjacent pixels is often used to represent image edge information. The Sobel operator convolves the image with small templates to compute a weighted difference that approximates the gradient of the discrete image data, and it uses these templates to detect horizontal and vertical edges.

In the traditional Sobel edge detection operator, edge localization and noise smoothing are in conflict. To overcome this shortcoming, edge detection is combined with template matching to balance the two. The first step is to extend the Sobel operator's coverage of edge directions. Because edges can run in different directions, six additional templates, each rotated a further 45 degrees clockwise, can be added [12]. Secondly, the Sobel operator algorithm is improved, that is, the convolution result S is computed for each of the 8 templates M1-M8: S1 = a1 + 2a8 + a7 - a3 - 2a4 - a5. So Si (1 =