Penerapan Merger Retriever pada Question Answering System Hadits (Application of the Merger Retriever in a Hadith Question Answering System)

  • Rozi Saputra, Universitas Islam Negeri Sultan Syarif Kasim Riau
  • Nazaruddin Safaat Harahap
  • Novriyanto
  • Muhammad Affandes
Keywords: Question Answering System, LangChain, Gemini-pro, Merger Retriever, Large Language Model

Abstract

Information technology now shapes many aspects of life, including the study of Hadith. This research develops a Hadith question answering system that lets users obtain quick and accurate answers to Hadith-related questions. The system is built with LangChain and the Gemini-pro Large Language Model (LLM), and it uses the merger retriever method to search Hadiths from the nine main books (kutubut tis’ah). LangChain and Gemini-pro are used to generate answers that are both relevant and accurate, while the merger retriever improves the system's ability to find the most suitable answers across several Hadith sources at once. The research proceeds through four stages: Hadith data collection, system analysis and design, implementation, and system testing. Testing shows that the system produces relevant and accurate answers: Hadith experts rated answer quality at an average of 88.5%, and the LangChain scoring evaluator gave an average of 91%, both within the "Strongly Agree" interval. These results indicate the potential of applying information technology to Hadith studies and open opportunities for further development in this field. The research is expected to help the Muslim community access accurate Hadith information quickly.
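The idea of querying several Hadith sources at once and merging the results can be illustrated with a minimal sketch. It is not the authors' implementation and does not call LangChain; it assumes that merging means interleaving each source's ranked results round-robin. The helper names (`keyword_retriever`, `merge_retrievers`), the source labels, and the placeholder passages are all hypothetical, and simple word overlap stands in for the embedding-based search a real system would likely use.

```python
from itertools import zip_longest


def keyword_retriever(passages, source):
    """Build a toy retriever that ranks passages by words shared with the query.

    Hypothetical stand-in for a per-book vector-store retriever.
    """
    def retrieve(query, k=2):
        query_words = set(query.lower().split())
        scored = [(len(query_words & set(p.lower().split())), p) for p in passages]
        hits = [(score, p) for score, p in scored if score > 0]
        hits.sort(key=lambda pair: -pair[0])  # stable sort keeps original order on ties
        return [(source, p) for _, p in hits[:k]]
    return retrieve


def merge_retrievers(retrievers, query, k=2):
    """Interleave the ranked results of every retriever round-robin, skipping duplicates."""
    ranked_lists = [retrieve(query, k) for retrieve in retrievers]
    merged, seen = [], set()
    for tier in zip_longest(*ranked_lists):  # tier = the i-th hit from each source
        for hit in tier:
            if hit is not None and hit not in seen:
                seen.add(hit)
                merged.append(hit)
    return merged


# Placeholder passages only; these are not real Hadith texts.
book_a = keyword_retriever(
    ["placeholder passage on intention and deeds", "placeholder passage on fasting"],
    "book_a",
)
book_b = keyword_retriever(
    ["placeholder passage on prayer", "placeholder passage on sincere intention"],
    "book_b",
)

results = merge_retrievers([book_a, book_b], "intention deeds")
for source, passage in results:
    print(f"{source}: {passage}")
```

Interleaving keeps the top hit from each book near the front of the merged list, so no single source dominates the context passed to the language model.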

Published
2024-06-10
How to Cite
Saputra, R., Harahap, N. S., Novriyanto, & Affandes, M. (2024). Penerapan Merger Retriever pada Question Answering System Hadits. SATIN - Sains Dan Teknologi Informasi, 10(1), 24-35. https://doi.org/10.33372/stn.v10i1.1117