TITLE:
Privacy-Preserving LLM Integration with Scientific NoSQL Repositories: A Differential Privacy Approach
AUTHORS:
Tanmoy Biswas
KEYWORDS:
Differential Privacy, Large Language Models, NoSQL, R&D Data Security, Scientific Documentation, Privacy-Preserving NLP
JOURNAL NAME:
World Journal of Engineering and Technology, Vol.13 No.2, May 27, 2025
ABSTRACT:
As the integration of Large Language Models (LLMs) into scientific R&D accelerates, the associated privacy risks become increasingly critical. Scientific NoSQL repositories, which often store sensitive experimental documentation, must be protected from data leakage and inference attacks. This paper proposes a novel privacy-preserving architecture that enables LLM-based querying, summarization, and guidance over scientific NoSQL datasets under differential privacy (DP) constraints. We introduce a comprehensive framework that includes local sensitivity analysis, DP-calibrated query transformation, privacy-aware embeddings, and a controlled interface for LLM interactions. Our experiments on synthetic and biomedical datasets demonstrate the trade-offs between privacy budgets and semantic utility. This work bridges the gap between secure data infrastructure and intelligent scientific interfaces, paving the way for compliant and interpretable AI deployments in research settings.