Predictive Analysis of Pharmaceutical Compounds Using Kernel Naive Bayes in Clinical Informatics

Abstract

Machine learning (ML) can transform several areas of the pharmaceutical sciences, including the classification of drugs into clinically useful categories, the quality of clinical decisions, and the accuracy of pharmacovigilance support. This study describes a Kernel Naive Bayes (KNB) model for drug classification based on a wide variety of pharmacological and therapeutic properties. Drawing on a drug product data repository, the model integrates fundamental drug-related features, such as dosage forms, routes of administration, adverse reactions, interactions, and indications for use, which are considered basic elements in pharmaceutical research and clinical pharmacy. It uses a Gaussian kernel to model continuous variables and a Multivariate Multinomial (MVMN) distribution to model categorical features, allowing it to represent more complex relationships among the features. To improve interpretability and mitigate noise, irrelevant or sparse attributes (e.g., regulatory codes and precautionary labels) were excluded. The final model attained an accuracy of 83.2% with a prediction speed of approximately 1,600 observations per second, demonstrating its potential for handling large-scale pharmaceutical data effectively and efficiently. These results support the relevance of kernel-based probabilistic models to pharmacy-related problems, especially drug safety screening, automated classification, and pharmacological data mining.
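The combination described in the abstract, a Gaussian kernel density per continuous feature and a per-class frequency table per categorical feature, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class name `KernelNaiveBayes`, the fixed `bandwidth`, the Laplace smoothing constant `alpha`, and the toy data are all assumptions made for illustration, and categorical features are assumed to be integer-coded from 0.

```python
import numpy as np


class KernelNaiveBayes:
    """Toy naive Bayes: Gaussian KDE for continuous columns,
    smoothed per-class frequency tables for categorical columns."""

    def __init__(self, bandwidth=0.5, alpha=1.0):
        self.bandwidth = bandwidth  # assumed fixed kernel width (hypothetical default)
        self.alpha = alpha          # assumed Laplace smoothing constant

    def fit(self, X_cont, X_cat, y):
        X_cont = np.asarray(X_cont, dtype=float)
        X_cat = np.asarray(X_cat, dtype=int)
        y = np.asarray(y)
        self.classes_ = np.unique(y)
        # Number of levels per categorical column, inferred from training data.
        self.n_levels_ = X_cat.max(axis=0) + 1
        self.log_prior_, self.train_cont_, self.log_cat_ = [], [], []
        for c in self.classes_:
            mask = y == c
            self.log_prior_.append(np.log(mask.mean()))
            # KDE is non-parametric: keep the class's training points.
            self.train_cont_.append(X_cont[mask])
            tables = []
            for j, k in enumerate(self.n_levels_):
                counts = np.bincount(X_cat[mask, j], minlength=k) + self.alpha
                tables.append(np.log(counts / counts.sum()))
            self.log_cat_.append(tables)
        return self

    def _log_kde(self, x, train):
        # One 1-D Gaussian KDE per feature; log densities are summed
        # across features (the naive independence assumption).
        z = (x[:, None, :] - train[None, :, :]) / self.bandwidth  # (n, m, d)
        log_k = -0.5 * z**2 - np.log(self.bandwidth * np.sqrt(2 * np.pi))
        mx = log_k.max(axis=1, keepdims=True)  # log-sum-exp for stability
        log_dens = mx[:, 0, :] + np.log(np.exp(log_k - mx).mean(axis=1))
        return log_dens.sum(axis=1)

    def joint_log_likelihood(self, X_cont, X_cat):
        # Assumes every categorical level seen here also fits the training range.
        X_cont = np.asarray(X_cont, dtype=float)
        X_cat = np.asarray(X_cat, dtype=int)
        scores = np.empty((X_cont.shape[0], len(self.classes_)))
        for i in range(len(self.classes_)):
            s = self.log_prior_[i] + self._log_kde(X_cont, self.train_cont_[i])
            for j, table in enumerate(self.log_cat_[i]):
                s = s + table[X_cat[:, j]]
            scores[:, i] = s
        return scores

    def predict(self, X_cont, X_cat):
        scores = self.joint_log_likelihood(X_cont, X_cat)
        return self.classes_[np.argmax(scores, axis=1)]


# Toy data (invented): one continuous and one categorical feature, two classes.
X_cont = [[0.0], [0.2], [0.1], [3.0], [3.2], [2.9]]
X_cat = [[0], [0], [1], [2], [2], [1]]
y = ["A", "A", "A", "B", "B", "B"]

model = KernelNaiveBayes().fit(X_cont, X_cat, y)
pred = model.predict([[0.1], [3.1]], [[0], [2]])
print(pred.tolist())
```

Working in log space keeps the product of many small per-feature probabilities numerically stable. In practice the kernel bandwidth would be chosen per feature from the data rather than fixed as it is here.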

Country: Iraq

1 Marwa Mawfaq Mohamedsheet Al-Hatab, 2 Mohamedshet Mwfq Mohmdsht, 3 Murtadha A. Salim, 4 Hussein M. Gatea, 5 Ghaith Z. Ihsan, 6 Muataz Z. Ahmed, 7 Ibrahim M. Hussein, 8 Omar A. Abdullah, 9 Alaq M. Zaki, 10 Wameedh R. Fathel

  1. Technical Engineering College, Northern Technical University, Mosul, Iraq
  2. Middle East University, Faculty of Pharmacy, Amman, Jordan
  3. Middle East University, Faculty of Pharmacy, Amman, Jordan
  4. Middle East University, Faculty of Pharmacy, Amman, Jordan
  5. Technical Engineering College, Northern Technical University, Mosul, Iraq
  6. Technical Engineering College, Northern Technical University, Mosul, Iraq
  7. Technical Engineering College, Northern Technical University, Mosul, Iraq
  8. Technical Engineering College, Northern Technical University, Mosul, Iraq
  9. College of Dentistry, University of Mosul, Iraq
  10. Ministry of Education, General Directorate of Education in Nineveh, Iraq

IRJIET, Volume 9, Issue 5, May 2025 pp. 88-97

doi.org/10.47001/IRJIET/2025.905012
