Privacy-Preserving Record Linkage: A Survey of Key Concepts and Approaches

Abstract

Massive data sets with large and complex structures are now common, and they are difficult to store, analyse, and visualize for further processing. Such voluminous data, especially personal data spread across multiple sources, offer businesses significant opportunities to analyse and extract value from linked and integrated data. Privacy is a major concern when data are shared or linked across the networks of different organizations. Privacy-Preserving Record Linkage (PPRL) aims to address this problem by identifying and linking records that correspond to the same real-world entity across several data sources held by different parties, without revealing any sensitive information about these entities. Data deduplication, sometimes described as intelligent comparison or single-instance storage, is a process that eliminates redundant copies of data and reduces storage overhead. In this article, we provide an overview of the research literature on privacy-preserving record linkage and discuss the different types of techniques that have been proposed. We conclude with an overview of PPRL techniques.

Country: India

1 Patel Krupali, 2 Dr. Prashant Pittalia

  1. Prof. V. B. Shah Institute of Management, R.V. Patel College of Commerce (English Medium), V. L. Shah College of Commerce (Gujarati Medium) and Sutex Bank College of Computer Applications & Science, Surat, Gujarat, India
  2. Department of Computer Science, Sardar Patel University, Vallabh Vidyanagar, Gujarat, India

IRJIET, Volume 10, Issue 1, January 2026, pp. 66-73

doi.org/10.47001/IRJIET/2026.101008
