Articles | Open Access | DOI: https://doi.org/10.37547/tajiir/Volume08Issue02-11

Engineering Trust in AI Systems: A Data-Layer Framework for Explainability and Auditability

Mohammed Arbaaz Shareef, Lead Data Engineer at Anblicks

Abstract

The article examines engineering approaches to strengthening trust in AI systems through data-layer controls that make decisions explainable and verifiable in audits. Its practical relevance stems from widening regulatory and organizational demands for traceability of training data, reproducibility of pipelines, and defensible documentation of model behavior in production. The scientific novelty lies in integrating provenance capture, lineage graphs, feature-store governance, and standardized documentation artifacts into a single coherent data-layer framework that produces machine-checkable evidence. The work describes how evidence is generated across the lifecycle, from ingestion to inference; examines how provenance models and lineage datasets support inspection; and analyzes how documentation instruments complement technical traces. Special attention is given to preventing evidence gaps caused by opaque preprocessing, weak versioning, and incomplete logging. The study aims to systematize a data-layer architecture that supports explainability and auditability without relying on new model classes. The methodology combines a comparative analysis of recent research, a synthesis of published frameworks, and a structured review of sources. The conclusion summarizes actionable controls and their expected audit outputs. The article is addressed to engineers, MLOps teams, risk functions, and internal and external auditors.
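To illustrate the kind of machine-checkable evidence the abstract refers to, the following is a minimal, hypothetical sketch (not the framework described in the article): each pipeline stage emits a provenance record containing content hashes of its inputs and outputs, and an audit check verifies that the lineage chain is unbroken. All function and field names here are illustrative assumptions.

```python
import hashlib
import json
from datetime import datetime, timezone


def dataset_fingerprint(records: list[dict]) -> str:
    """Content-address a dataset snapshot with a deterministic SHA-256 hash.

    Canonical JSON (sorted keys, fixed separators) makes the hash stable
    across key ordering, so the same data always yields the same digest.
    """
    canonical = json.dumps(records, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def make_evidence_record(stage: str, inputs: list[dict], outputs: list[dict],
                         pipeline_version: str) -> dict:
    """Emit one machine-checkable provenance record for a pipeline stage."""
    return {
        "stage": stage,
        "pipeline_version": pipeline_version,
        "input_hash": dataset_fingerprint(inputs),
        "output_hash": dataset_fingerprint(outputs),
        "captured_at": datetime.now(timezone.utc).isoformat(),
    }


def verify_lineage(records: list[dict]) -> bool:
    """Audit check: each stage's input hash must equal the previous
    stage's output hash, otherwise an evidence gap exists."""
    return all(prev["output_hash"] == curr["input_hash"]
               for prev, curr in zip(records, records[1:]))
```

An auditor could then replay the chain: collect the records for ingestion, transformation, and feature-serving stages and run `verify_lineage` over them; any stage with opaque preprocessing or missing logging shows up as a hash mismatch rather than a narrative claim.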

Keywords

trustworthy AI, explainability, auditability, data provenance, data lineage, feature store, model documentation, traceability, metadata governance, reproducible pipelines.

References

Ahmed, M., Dar, A. R., Helfert, M., Khan, A., & Kim, J. (2023). Data provenance in healthcare: Approaches, challenges, and future directions. Sensors, 23(14), 6495. https://doi.org/10.3390/s23146495

Chen, Y., Zhao, Y., Li, X., Zhang, J., Long, J., & Zhou, F. (2024). An open dataset of data lineage graphs for data governance research. Visual Informatics, 8(1), 1–5. https://doi.org/10.1016/j.visinf.2024.01.001

de la Rúa Martínez, J., Buso, F., Kouzoupis, A., Ormenisan, A. A., Niazi, S., Bzhalava, D., Mak, K., Jouffrey, V., Ronström, M., Cunningham, R., Zangis, R., Mukhedkar, D., Khazanchi, A., Vlassov, V., & Dowling, J. (2024). The Hopsworks feature store for machine learning. In Companion of the 2024 International Conference on Management of Data (SIGMOD ’24) (pp. 135–147). Association for Computing Machinery. https://doi.org/10.1145/3626246.3653389

Gilbert, S., Adler, R., Holoyad, T., & Weicken, E. (2025). Could transparent model cards with layered accessible information drive trust and safety in health AI? npj Digital Medicine, 8(1), 124. https://doi.org/10.1038/s41746-025-01482-9

Kalokyri, V., Tachos, N. S., Kalantzopoulos, C. N., Sfakianakis, S., Kondylakis, H., Zaridis, D. I., Colantonio, S., Regge, D., Papanikolaou, N., Marias, K., Fotiadis, D. I., & Tsiknakis, M. (2025). AI model passport: Data and system traceability framework for transparent AI in health. Computational and Structural Biotechnology Journal, 28, 386–404. https://doi.org/10.1016/j.csbj.2025.09.041

Liu, R., Park, K., Psallidas, F., Zhu, X., Mo, J., Sen, R., Interlandi, M., Karanasos, K., Tian, Y., & Camacho-Rodríguez, J. (2023). Optimizing data pipelines for machine learning in feature stores. Proceedings of the VLDB Endowment, 16(13), 4230–4239. https://doi.org/10.14778/3625054.3625060

Longpre, S., Mahari, R., Chen, A., et al. (2024). A large-scale audit of dataset licensing and attribution in AI. Nature Machine Intelligence, 6, 975–987. https://doi.org/10.1038/s42256-024-00878-8

Mökander, J., Schuett, J., Kirk, H. R., et al. (2024). Auditing large language models: A three-layered approach. AI Ethics, 4, 1085–1115. https://doi.org/10.1007/s43681-023-00289-2

Schlegel, M., & Sattler, K.-U. (2025). Capturing end-to-end provenance for machine learning pipelines. Information Systems, 132, 102495. https://doi.org/10.1016/j.is.2024.102495

Staufer, L., Yang, M., Reuel, A., & Casper, S. (2025). Audit cards: Contextualizing AI evaluations (arXiv:2504.13839). arXiv. https://arxiv.org/abs/2504.13839

How to Cite

Shareef, M. A. (2026). Engineering Trust in AI Systems: A Data-Layer Framework for Explainability and Auditability. The American Journal of Interdisciplinary Innovations and Research, 8(2), 83–89. https://doi.org/10.37547/tajiir/Volume08Issue02-11