Food drying is essential for food production, extending shelf life, and
reducing transportation costs. Accurate real-time forecasting of drying
readiness is crucial for minimizing energy consumption, improving productivity,
and ensuring product quality. However, this remains challenging due to the
dynamic nature of drying, limited data availability, and the lack of effective
predictive analytical methods. To address this gap, we propose an end-to-end
multi-modal data fusion framework that integrates in-situ video data with
process parameters for real-time food drying readiness forecasting. Our
approach leverages a new encoder-decoder architecture with modality-specific
encoders and a transformer-based decoder to effectively extract features while
preserving the unique structure of each modality. We apply our approach to
sugar cookie drying, where time-to-ready is predicted at each timestamp.
Experimental results demonstrate that our model achieves an average prediction
error of only 15 seconds, outperforming state-of-the-art data fusion methods by
65.69% and a video-only model by 11.30%. Additionally, our model balances
prediction accuracy, model size, and computational efficiency, making it
well suited to heterogeneous industrial datasets. The proposed model is
extensible to various other industrial modality fusion tasks for online
decision-making.
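
To make the prediction target concrete, the following is a minimal sketch of how per-timestamp time-to-ready labels and an average prediction error could be computed. It is an illustration under stated assumptions, not the authors' implementation: the helper names `time_to_ready` and `mean_absolute_error`, the 60-second sampling interval, the 300-second ready time, and the prediction values are all hypothetical, and the abstract does not state that mean absolute error is the metric behind the reported 15-second figure.

```python
from typing import List


def time_to_ready(timestamps: List[float], ready_time: float) -> List[float]:
    """Remaining time until readiness at each timestamp, clamped at zero."""
    return [max(ready_time - t, 0.0) for t in timestamps]


def mean_absolute_error(predicted: List[float], actual: List[float]) -> float:
    """Average absolute gap between predicted and actual values, in seconds."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical drying run: one sample every 60 s, product ready at 300 s.
timestamps = [0.0, 60.0, 120.0, 180.0, 240.0, 300.0]
targets = time_to_ready(timestamps, ready_time=300.0)

# Made-up model outputs, for illustration only (not results from the paper).
predictions = [310.0, 250.0, 170.0, 130.0, 50.0, 5.0]
error = mean_absolute_error(predictions, targets)
print(f"average prediction error: {error:.2f} s")
```

In this framing, a forecasting model would emit one time-to-ready estimate per timestamp, and the error is averaged over all timestamps of a run.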
arXiv:2504.15599v1