Recipe Decoder

Abstract

Automated recipe generation from food images remains challenging for diverse cuisines such as Indian food, which involves intricate spice combinations and regional variations. This paper proposes Recipe Decoder, a multimodal system that leverages a custom EfficientNet-B4 model for dish classification and the Gemini API for context-aware recipe generation, augmented by the Spoonacular API for recipe exploration. Our approach addresses three key gaps: (1) accurate identification of visually similar Indian dishes (e.g., differentiating roti from kulcha), (2) culturally appropriate ingredient-to-instruction translation, and (3) real-time integration of user preferences.

The system achieves 92% validation accuracy on a dataset of 2,000 Indian food images, outperforming a ResNet-50 baseline. Recipe generation employs prompt engineering with Gemini to convert predicted dish classes into structured cooking steps. The front-end interface is developed with React (scaffolded with Vite) and styled with Tailwind CSS and DaisyUI, providing a responsive and visually appealing user experience that reduces search time by 40% compared to traditional keyword-based systems.
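The prompt-engineering step can be illustrated with a small sketch. The template wording and preference fields below are illustrative assumptions, not the paper's actual prompts:

```python
# Sketch: turning a predicted dish class into a structured recipe prompt.
# The template and the preference fields ("diet", "servings") are assumptions.

def build_recipe_prompt(dish, preferences=None):
    """Compose a structured prompt asking for ingredients and numbered steps."""
    prefs = preferences or {}
    lines = [
        f"You are an expert in Indian cuisine. Generate a recipe for: {dish}.",
        "Return the answer in two sections:",
        "1. Ingredients (with quantities)",
        "2. Instructions (numbered cooking steps)",
    ]
    if prefs.get("diet"):
        lines.append(f"Dietary constraint: {prefs['diet']}.")
    if prefs.get("servings"):
        lines.append(f"Scale quantities for {prefs['servings']} servings.")
    return "\n".join(lines)

prompt = build_recipe_prompt("paneer butter masala",
                             {"diet": "vegetarian", "servings": 4})
# The resulting string would then be sent to the Gemini API; the structured
# sections make the model's output easy to parse into an ingredients list
# and step-by-step instructions for display in the front end.
print(prompt)
```

Constraining the output format in the prompt is what lets the predicted class label be converted into the "structured cooking steps" described above.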

This work advances culinary AI by establishing benchmarks for ethnic cuisine analysis and by introducing a hybrid architecture that combines a convolutional vision model with a large language model. Future extensions could enable dietary customization and video-based cooking assistance.

Country: India

Harshita Sonkar¹, Laxmi Pawar², Akanksha Puri³, Rahul Gupta⁴, Prof. Sonali Deshpande⁵

  1. Student, Smt. Indira Gandhi College of Engineering, Ghansoli, Navi Mumbai, Maharashtra, India
  2. Student, Smt. Indira Gandhi College of Engineering, Ghansoli, Navi Mumbai, Maharashtra, India
  3. Student, Smt. Indira Gandhi College of Engineering, Ghansoli, Navi Mumbai, Maharashtra, India
  4. Student, Smt. Indira Gandhi College of Engineering, Ghansoli, Navi Mumbai, Maharashtra, India
  5. Professor, Head of Dept. of AIML, Smt. Indira Gandhi College of Engineering, Ghansoli, Navi Mumbai, Maharashtra, India

IRJIET, Volume 9, Issue 4, April 2025 pp. 61-74

doi.org/10.47001/IRJIET/2025.904009
