Title: Reinforcement learning for robot manipulation tasks in human-robot collaboration using the CQL/SAC algorithms
Authors: Banjanović-Mehmedović, Lejla; Gurdić-Ribić, A.; Husaković, A.; Karabegović, Isak; Prljača, Naser
Journal: Advances in production engineering and management
Year: 2025
Volume: 20
Issue: 1
Pages: 5-17
DOI: 10.14743/apem2025.1.523
ISSN: 1854-6250
COBISSID_HOST: 264960259
URN: URN:NBN:SI:doc-V6OH4GKA
Language: en
Publisher: Fakulteta za strojništvo, Inštitut za proizvodno strojništvo; Univerza v Mariboru, Fakulteta za strojništvo
Type: text
Collection: scientific journals
Source: National and University Library of Slovenia
Provider: Slovenian National E-content Aggregator
Keywords: Conservative Q-learning; deep reinforcement learning; human-robot collaboration; human-robot interaction; manipulators; robot learning; robot manipulation tasks; robotics; soft actor-critic algorithm; machine learning

Abstract:
The integration of human-robot collaboration (HRC) into industrial and service environments demands efficient and adaptive robotic systems capable of executing diverse tasks, including pick-and-place operations. This paper investigates the application of two deep reinforcement learning (DRL) algorithms, Soft Actor-Critic (SAC) and Conservative Q-Learning (CQL), to the learning and optimization of pick-and-place actions within HRC scenarios. By leveraging SAC's capability to balance exploration and exploitation, the robot autonomously learns to perform pick-and-place tasks while adapting to dynamic environments and human interactions. Moreover, the integration of CQL ensures more stable learning by mitigating Q-value overestimation, which proves particularly advantageous in offline and suboptimal data scenarios. The combined use of CQL and SAC enhances policy robustness, facilitating safer and more efficient decision-making in continually evolving environments. The proposed framework combines simulation-based training with transfer learning techniques to enable deployment in real-world environments. The critical challenge of trajectory completion is addressed through a reward function designed to promote efficiency, precision, and safety. Experimental validation demonstrates a 100 % success rate in simulation and an 80 % success rate on real hardware, confirming the practical viability of the proposed model. This work underscores the pivotal role of DRL in enhancing the functionality of collaborative robotic systems and illustrates its applicability across a range of industrial environments.
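The abstract states that CQL stabilizes learning by mitigating Q-value overestimation on top of SAC. As a rough illustration of that idea, the sketch below combines the standard SAC soft Bellman target with the commonly used CQL log-sum-exp penalty. It is not taken from the paper: the function names, the default coefficients (gamma, entropy_coef, cql_weight), and the use of plain NumPy arrays in place of neural-network outputs are all assumptions made for illustration.

```python
import numpy as np


def logsumexp(x, axis):
    """Numerically stable log(sum(exp(x))) along one axis."""
    m = np.max(x, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(x - m), axis=axis))


def soft_td_target(reward, done, next_q1, next_q2, next_log_prob,
                   gamma=0.99, entropy_coef=0.2):
    """SAC-style target: reward plus discounted soft value of the next state.

    Taking the minimum of two critics and subtracting the scaled
    log-probability is the usual SAC form; the defaults are assumptions.
    """
    soft_value = np.minimum(next_q1, next_q2) - entropy_coef * next_log_prob
    return reward + gamma * (1.0 - done) * soft_value


def cql_penalty(q_sampled, q_data, cql_weight=1.0):
    """CQL regularizer: pushes down Q on sampled actions, up on dataset actions.

    q_sampled: shape (batch, n_sampled_actions), Q-values of sampled actions.
    q_data:    shape (batch,), Q-values of the actions stored in the dataset.
    """
    return cql_weight * np.mean(logsumexp(q_sampled, axis=1) - q_data)


def conservative_critic_loss(q_data, td_target, q_sampled, cql_weight=1.0):
    """Mean squared Bellman error plus the conservative penalty."""
    bellman_error = 0.5 * np.mean((q_data - td_target) ** 2)
    return bellman_error + cql_penalty(q_sampled, q_data, cql_weight)
```

In this form, raising the Q-value of any sampled action that is not supported by the dataset increases the penalty, which is the mechanism usually credited with curbing overestimation in offline or suboptimal data settings.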