Design and Development of a Contextual Search System for Qur’anic Text Based on Large Language Models (LLMs) or Artificial Intelligence (AI)

Authors

  • Lukman Hakim Husnan, STIQ Al Lathifiyyah Palembang
  • Listiananda Apriliawan, Airpaz Indonesia
  • Lailatul Mu'jizati, STIQ Al Lathifiyyah Palembang
  • Kgs. Adlan Maghfur, STIQ Al Lathifiyyah Palembang
  • Siti Alfiatun Hasanah, STIQ Al Lathifiyyah Palembang

DOI:

https://doi.org/10.32923/rnfgb906

Keywords:

Large Language Model, Contextual Search, GuardRail Prompt, Al Qur’an, Hallucination Mitigation.

Abstract

This study addresses the challenge of providing reliable, safe, and contextually accurate access to the text of the Holy Qur’an using Large Language Models (LLMs). While traditional approaches often rely on Retrieval-Augmented Generation (RAG) for factual grounding, this research proposes and evaluates a novel, non-RAG architectural approach centered on advanced, multi-layered GuardRail Prompting to manage the inherent risks of LLM stochasticity and hallucination in sensitive religious domains. The system integrates Input Guardrails for prompt-injection mitigation and critical Output Guardrails that use an LLM-as-a-Judge framework and a re-generation loop to validate responses against semantic relevance, Shariah compliance, and structural integrity (JSON Schema). The research adopts a Research and Development (R&D) methodology combined with a computational experimental approach to evaluate system performance, moderation effectiveness, and the functional role of rule-based control in regulating generative model outputs. Results demonstrate that a well-engineered GuardRail architecture can effectively constrain LLM behavior, achieving high faithfulness and relevance, with low PGR and acceptable FPR across adversarial and benign query datasets. This research establishes GuardRail Prompting as a viable and robust alternative for contextual grounding in sensitive, knowledge-intensive applications where RAG deployment may be restricted or structurally undesirable.
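To make the described architecture concrete, the following is a minimal Python sketch of a multi-layered guardrail pipeline of the kind the abstract outlines: an input guardrail against prompt injection, a generation step, and an output guardrail combining JSON Schema validation with an LLM-as-a-Judge check inside a re-generation loop. It is illustrative only and not the authors' implementation; the `call_llm` callable, the schema fields, and the judge rubric are assumptions.

```python
"""Illustrative sketch of a multi-layered GuardRail pipeline (hypothetical, not the paper's code)."""
import json
from jsonschema import validate, ValidationError  # pip install jsonschema

# Structural guardrail: every answer must conform to this (hypothetical) schema.
ANSWER_SCHEMA = {
    "type": "object",
    "properties": {
        "surah": {"type": "string"},
        "ayah": {"type": "integer", "minimum": 1},
        "answer": {"type": "string"},
    },
    "required": ["surah", "ayah", "answer"],
}

# Toy heuristic markers for prompt-injection attempts.
INJECTION_MARKERS = ("ignore previous instructions", "system prompt", "jailbreak")


def input_guardrail(query: str) -> bool:
    """Reject queries that look like prompt-injection attempts."""
    return not any(marker in query.lower() for marker in INJECTION_MARKERS)


def output_guardrail(raw: str, query: str, call_llm) -> bool:
    """Validate structure (JSON Schema) and content (LLM-as-a-Judge)."""
    try:
        validate(instance=json.loads(raw), schema=ANSWER_SCHEMA)
    except (json.JSONDecodeError, ValidationError):
        return False
    verdict = call_llm(
        "You are a strict judge. Reply PASS or FAIL only.\n"
        f"Question: {query}\nCandidate answer: {raw}\n"
        "PASS only if the answer is relevant to the question and "
        "consistent with Qur'anic sources; otherwise FAIL."
    )
    return verdict.strip().upper().startswith("PASS")


def answer(query: str, call_llm, max_retries: int = 2):
    """Full pipeline: input check -> generate -> output check -> re-generate loop."""
    if not input_guardrail(query):
        return None  # refuse unsafe input
    for _ in range(max_retries + 1):
        raw = call_llm(
            "Answer the question about the Qur'an strictly as JSON with keys "
            f"surah, ayah, answer.\nQuestion: {query}"
        )
        if output_guardrail(raw, query, call_llm):
            return raw
    return None  # all attempts failed the output guardrail
```

In this sketch, a response that fails either the structural check or the judge's verdict triggers another generation attempt, mirroring the re-generation loop described in the abstract; returning `None` after exhausted retries corresponds to a safe refusal.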

References

Abdul-Karim, M., Al-Sayed, H., & Nour, F. (2025). IslamicEval: A shared task for evaluating hallucination and factuality of LLMs on Islamic texts. ACL Anthology.

Alnefaie, M., Alharthi, I., & Alghamdi, A. (2024). LLMs based approach for Quranic question answering. In Proceedings of the 13th International Conference on Computer Science and Information Technology. SCITEPRESS. https://www.scitepress.org/Papers/2024/130129/130129.pdf

Alqarni, M. (2024). Embedding search for Quranic texts based on large language models. The International Arab Journal of Information Technology, 21(2), 243–256. https://doi.org/10.34028/iajit/21/2/7

Asseri, B., Abdelaziz, E., & Al-Wabil, A. (2025). Prompt engineering techniques for mitigating cultural bias against Arabs and Muslims in large language models: A systematic review. arXiv preprint arXiv:2506.18199.

Bhojani, A.-R., & Schwarting, M. (2023). Truth and regret: Large language models, the Quran, and misinformation. Journal of Qur’anic Studies, advance online publication. https://www.tandfonline.com/doi/pdf/10.1080/14746700.2023.2255944

Dong, Y., Mu, R., Zhang, Y., Sun, S., Zhang, T., Wu, C., & Huang, X. (2025). Safeguarding large language models: A survey. Artificial Intelligence Review, 58(12), 382.

Endtrace. (2024). Prompt engineering with guardrails: Safety-first design for LLMs. https://www.endtrace.com/prompt-engineering-with-guardrails-guide/

Erick, H. (2025). How JSON Schema works for structured outputs and tool integration. PromptLayer Blog. Retrieved October 23, 2025, from https://blog.promptlayer.com/how-json-schema-works-for-structured-outputs-and-tool-integration/

Huang, K. (2023, November 22). Mitigating security risks in Retrieval Augmented Generation (RAG) LLM applications. Cloud Security Alliance. https://cloudsecurityalliance.org/blog/2023/11/22/mitigating-security-risks-in-retrieval-augmented-generation-rag-llm-applications

Jiang, Y., Kumar, S., & Tandon, R. (2024). Effective strategies for mitigating hallucinations in large language models. arXiv. https://arxiv.org/abs/2402.01832

Lukose, D. (2025). Guardrails implementation best practice. Medium. Retrieved October 23, 2025, from https://medium.com/@dickson.lukose/guardrails-implementation-best-practice-e5fa2c1e4e09

Milvus. (2025). What metrics are used to evaluate the success of LLM guardrails? Retrieved October 24, 2025, from https://milvus.io/ai-quick-reference/what-metrics-are-used-to-evaluate-the-success-of-llm-guardrails

NVIDIA. (2025). Measuring the effectiveness and performance of AI guardrails in generative AI applications. NVIDIA Developer Blog. Retrieved October 24, 2025, from https://developer.nvidia.com/blog/measuring-the-effectiveness-and-performance-of-ai-guardrails-in-generative-ai-applications/

Nystrom, R. (2025). LLM guardrails: How to build reliable and safe AI applications. Neptune.ai Blog. https://neptune.ai/blog/llm-guardrails

Promptfoo. (2025). How to measure and prevent LLM hallucinations. Retrieved October 23, 2025, from https://www.promptfoo.dev/docs/guides/prevent-llm-hallucinations

Schwarting, M. (2025). To Christians developing LLM applications: A warning, and some suggestions. AI and Faith. https://aiandfaith.org/featured-content/to-christians-developing-llm-applications-a-warning-and-some-suggestions/

Shikkhaghildiyai. (2025). Context-aware RAG system with Azure AI Search to cut token costs and boost accuracy. Microsoft TechCommunity. https://techcommunity.microsoft.com/blog/azure-ai-foundry-blog/context-aware-rag-system-with-azure-ai-search-to-cut-token-costs-and-boost-accur/4456810

Suo, S., et al. (2024). Signed-Prompt: A new approach to prevent prompt injection attacks against LLM-integrated applications. arXiv preprint arXiv:2401.07612.

Tam, T. Y. C., et al. (2024). A framework for human evaluation of large language models in healthcare derived from literature review. PubMed Central. Retrieved October 23, 2025, from https://pmc.ncbi.nlm.nih.gov/articles/PMC11437138/

Wang, X., Ji, Z., Wang, W., Li, Z., Wu, D., & Wang, S. (2025). SoK: Evaluating Jailbreak Guardrails for Large Language Models. arXiv preprint arXiv:2506.10597.

Waqar, K. M., Ibrahim, M., & Khan, M. M. I. (2025). Ethical implications of artificial intelligence: An Islamic perspective. Journal of Religion and Society, 3(1), 347–358.

Xiong, W., et al. (2025). Defensive Prompt Patch: A robust and generalizable defense of large language models against jailbreak attacks. Findings of ACL 2025.

Zarecki, I. (2024). LLM guardrails guide AI toward safe, reliable outputs. K2View. https://www.k2view.com/blog/llm-guardrails

Zheng, T., Li, H., & Shuster, K. (2025). A survey on LLM-as-a-Judge: Capabilities, limitations, and calibration techniques. arXiv. https://arxiv.org/abs/2503.01234

Published

2025-12-01

Issue

Section

Articles

How to Cite

Design and Development of a Contextual Search System for Qur’anic Text Based on Large Language Models (LLMs) or Artificial Intelligence (AI). (2025). Scientia: Jurnal Hasil Penelitian, 10(2), 13-27. https://doi.org/10.32923/rnfgb906