Culinary Culture: A Global Exploration of Health and Diversity in Cuisine
May 6, 2026
Mubaswira Ibnat Zidney
Anik Kumar Sannyashi
Raihan Tanvir
Faisal Muhammad Shah
Abstract
Accurate classification of food by cuisine and dietary category is pivotal for personalized nutrition and intelligent recommendation systems, yet unimodal approaches often struggle with label inconsistencies and the cultural diversity of recipes. This study presents a multimodal deep learning methodology that integrates textual ingredient semantics with visual food image features to jointly predict cuisine and diet, offering a robust solution to these limitations. We refine a dataset of 4,986 recipes by consolidating over 76 regional cuisine labels into 30 country-level classes, improving semantic coherence and class balance. Our framework employs transformer-based encoders to distill contextual ingredient information and advanced visual encoders to extract image representations, which are fused via an average projection with dropout to maximize predictive accuracy. Evaluated on the refined dataset, this multimodal approach achieves 81% accuracy for cuisine and 79% for diet, surpassing text-only baselines (up to 40% cuisine accuracy) and image-only baselines (up to 25% cuisine accuracy) by 15–40%. Ablation studies underscore the efficacy of our fusion strategy in handling noisy labels, positioning this methodology as a scalable foundation for dietary assessment, smart kitchen systems, and food informatics.
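The fusion strategy the abstract describes can be sketched minimally as follows: each modality's embedding is projected into a shared space, the projections are averaged, dropout is applied, and the fused vector feeds two classification heads. All dimensions, weight names, the dropout rate, and the number of diet classes below are illustrative assumptions, not values taken from the paper (only the 30 cuisine classes are stated).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 30 cuisine classes per the paper; everything else is assumed.
TEXT_DIM, IMAGE_DIM, FUSED_DIM = 768, 2048, 512
N_CUISINES, N_DIETS = 30, 5

# Linear projections mapping each modality into the shared fused space,
# plus one classification head per task.
W_text = rng.normal(0, 0.02, (TEXT_DIM, FUSED_DIM))
W_image = rng.normal(0, 0.02, (IMAGE_DIM, FUSED_DIM))
W_cuisine = rng.normal(0, 0.02, (FUSED_DIM, N_CUISINES))
W_diet = rng.normal(0, 0.02, (FUSED_DIM, N_DIETS))

def dropout(x, p=0.3, train=True):
    """Inverted dropout; identity at inference time."""
    if not train:
        return x
    mask = rng.random(x.shape) > p
    return x * mask / (1 - p)

def fuse_and_predict(text_emb, image_emb, train=False):
    # Project each modality, average the projections, apply dropout,
    # then feed the fused vector to both classification heads.
    fused = (text_emb @ W_text + image_emb @ W_image) / 2
    fused = dropout(fused, train=train)
    return fused @ W_cuisine, fused @ W_diet

text_emb = rng.normal(size=(4, TEXT_DIM))    # e.g. transformer pooled ingredient embeddings
image_emb = rng.normal(size=(4, IMAGE_DIM))  # e.g. visual encoder pooled features
cuisine_logits, diet_logits = fuse_and_predict(text_emb, image_emb)
print(cuisine_logits.shape, diet_logits.shape)  # (4, 30) (4, 5)
```

Averaging projected embeddings (rather than concatenating them) keeps the fused dimension fixed regardless of how many modalities are present, which is one common motivation for this kind of fusion.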
Type: Publication
Published in: 2025 28th International Conference on Computer and Information Technology (ICCIT)