AI-Driven Predictive Disk Failure Management for Cloud-Based RAID Storage Systems

Abstract

Cloud computing infrastructures rely extensively on RAID-based storage systems to ensure data reliability and high availability. However, conventional RAID failure detection mechanisms are largely reactive, resulting in unexpected downtime and increased operational costs. This paper proposes an AI-driven predictive disk failure management framework for cloud-based RAID storage systems. The proposed approach analyzes SMART disk attributes and storage performance metrics to predict failures before they occur. A Random Forest classifier is trained using a SMART-based dataset augmented with simulated RAID failure scenarios and is evaluated against threshold-based monitoring and Support Vector Machine (SVM) baselines. Experimental results on 12,000 disk health records demonstrate that the proposed model achieves 94.2% accuracy, 93.5% precision, and 92.8% recall, outperforming both baselines. The framework supports scalable cloud deployment and proactive alerting, thereby improving storage reliability and reducing downtime.
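
To make the evaluation terms in the abstract concrete, the following minimal Python sketch applies a threshold-based monitoring rule to a handful of disk records and computes accuracy, precision, and recall with disk failure as the positive class. It is an illustration only and is not the paper's implementation: the `DiskRecord` type, the two SMART counters chosen (attributes 5 and 197), the limits of 50 and 10, and the eight toy records are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class DiskRecord:
    # Hypothetical, hand-made record; not taken from the paper's dataset.
    reallocated_sectors: int  # SMART attribute 5, raw value
    pending_sectors: int      # SMART attribute 197, raw value
    failed: bool              # ground-truth label


def threshold_predict(rec, max_reallocated=50, max_pending=10):
    """Flag a disk as failing when either counter exceeds its (assumed) limit."""
    return (rec.reallocated_sectors > max_reallocated
            or rec.pending_sectors > max_pending)


def evaluate(records, predict):
    """Return (accuracy, precision, recall), treating 'failed' as positive."""
    tp = fp = tn = fn = 0
    for rec in records:
        predicted = predict(rec)
        if predicted and rec.failed:
            tp += 1
        elif predicted and not rec.failed:
            fp += 1
        elif not predicted and rec.failed:
            fn += 1
        else:
            tn += 1
    accuracy = (tp + tn) / len(records)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Toy records chosen to show one false alarm and one missed failure.
records = [
    DiskRecord(0, 0, False),
    DiskRecord(3, 0, False),
    DiskRecord(120, 4, True),
    DiskRecord(12, 25, True),
    DiskRecord(40, 8, True),   # fails while under both limits: missed
    DiskRecord(75, 0, False),  # healthy but over the limit: false alarm
    DiskRecord(1, 1, False),
    DiskRecord(0, 2, False),
]

accuracy, precision, recall = evaluate(records, threshold_predict)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f}")
# prints: accuracy=0.750 precision=0.667 recall=0.667
```

Because `evaluate` accepts any prediction function, a trained classifier could be scored the same way by passing its prediction routine in place of `threshold_predict`.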

Country: India

1 Lakshmi D R

  1. Assistant Professor, Department of BCA, SSIBM, SSIT Campus, Tumkur, India

IRJIET, Volume 10, Issue 1, January 2026, pp. 123-127

doi.org/10.47001/IRJIET/2026.101014
