
Sequential Recommenders and Multimodal Inputs: Mitigating Data Quality Issues in Industry-Scale Recommenders

Jan Malte Lichtenberg

Matteo Rufini

Albatross AI

2026

Poor data quality remains a key bottleneck for advancing recommender systems in industrial settings. In this talk, we argue that sequential transformer-based recommenders, particularly ID-based architectures such as SASRec [1], are more robust to several forms of data quality issues than traditional learning-to-rank approaches. However, they suffer acutely from item cold-start, which we treat as a missing-modality problem. We then discuss how multimodal content embeddings can address this challenge and present DenseRec [2], a simple but effective method that mitigates item cold-start by integrating dense content embeddings into sequential models.

Keywords: Data Quality, Multimodal Recommenders, Sequential Recommenders, Item Cold Start, Content Embeddings

embeddings with dense multimodal representations in a way that preserves the advantages of both. This connects directly to the workshop’s focus on data quality in multimodal recommendation [3]: item cold-start is a severe missing-modality scenario in which the behavioral modality is completely absent [7].

DenseRec: A Dual-Path Approach. We proposed DenseRec [2], a simple yet effective dual-path method for sequential recommendation. DenseRec (Figure 1) retains a primary ID-based path while adding a secondary “dense path” derived from pretrained content embeddings. A learnable projection layer aligns the dense representations with the ID embedding space. During training, the two paths are activated stochastically, with the probability p_dense as a hyperparameter. At inference time, the dense path is used only for cold-start items, while known items rely exclusively on the ID path. This ensures that multimodal signals augment the model only where necessary, thereby reducing destructive interference with strong behavioral signals. The approach treats item cold-start as a missing-modality problem and applies a selective integration strategy to maintain recommendation quality across both warm and cold items.
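
The path selection described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the dictionary-based embedding tables, and the fixed projection matrix are hypothetical, and the sketch assumes that the dense path is chosen independently per item with the dense-path probability (called p_dense here). In a real model the ID table and the projection would be trained parameters.

```python
import random


def project(dense_vec, weight):
    """Linearly map a content embedding into the ID embedding space.

    `weight` has one row per output dimension (hypothetical layout).
    """
    return [sum(w * x for w, x in zip(row, dense_vec)) for row in weight]


def item_embedding(item_id, id_table, dense_table, weight,
                   p_dense, training, rng=random):
    """Return the embedding fed to the sequence model for one item.

    Training: take the dense path with probability p_dense, otherwise
    the ID path (assumption: the choice is made per item).
    Inference: take the dense path only for items without an ID embedding.
    """
    if training:
        use_dense = rng.random() < p_dense
    else:
        use_dense = item_id not in id_table
    if use_dense:
        return project(dense_table[item_id], weight)
    return id_table[item_id]


# Toy data: two warm items with ID embeddings, one cold item without.
id_table = {"a": [1.0, 0.0], "b": [0.0, 1.0]}
dense_table = {"a": [1.0, 2.0, 3.0], "b": [0.0, 1.0, 0.0], "new": [1.0, 1.0, 1.0]}
weight = [[0.1, 0.0, 0.0], [0.0, 0.1, 0.0]]  # maps 3-d content to 2-d ID space

warm = item_embedding("a", id_table, dense_table, weight, 0.5, training=False)
cold = item_embedding("new", id_table, dense_table, weight, 0.5, training=False)
print(warm, cold)
```

At inference the warm item "a" keeps its ID embedding, while the unseen item "new" is represented by its projected content embedding, so both live in the same space and can be scored by the same sequence model.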

Experimental Validation. We evaluated DenseRec on three Amazon Reviews 2023 datasets [8] using time-based train/validation/test splits with strict cold-start: unseen items appear only in the test set. To embed item content, we used the all-MiniLM-L6-v2 model¹ from the sentence-transformers library [9]. We compared DenseRec against a strong ID-only SASRec baseline, applying hyperparameter optimization only to the baseline for fairness. Results, measured in HR@100, show that DenseRec significantly mitigates item cold-start, with performance varying smoothly as a function of p_dense. In particular, p_dense = 0.5 yielded a favorable balance between ID and dense signals, demonstrating that careful integration of multimodal content can address data quality issues arising from missing behavioral data. In ongoing work, we experiment with multimodal embeddings to build upon the results presented in [2].
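
As a rough illustration of the evaluation setup, the sketch below shows a time-based split, the identification of cold-start items, and the hit rate at K. The helper names, the (user, item, timestamp) tuple layout, and the cutoff values are assumptions made for this example; the exact protocol used in the experiments may differ.

```python
def time_split(interactions, val_start, test_start):
    """Split (user, item, timestamp) tuples by two time cutoffs."""
    train = [x for x in interactions if x[2] < val_start]
    val = [x for x in interactions if val_start <= x[2] < test_start]
    test = [x for x in interactions if x[2] >= test_start]
    return train, val, test


def cold_items(seen_interactions, test_interactions):
    """Items that occur in the test split but never before it."""
    seen = {item for _, item, _ in seen_interactions}
    return {item for _, item, _ in test_interactions if item not in seen}


def hit_rate_at_k(ranked_lists, targets, k=100):
    """Fraction of cases whose target item is among the top-k ranked items."""
    if not targets:
        return 0.0
    hits = sum(1 for ranked, target in zip(ranked_lists, targets)
               if target in ranked[:k])
    return hits / len(targets)


# Toy log: item "c" first appears after the test cutoff, so it is cold.
interactions = [("u1", "a", 1), ("u2", "b", 2), ("u1", "b", 3),
                ("u2", "a", 4), ("u1", "c", 5)]
train, val, test = time_split(interactions, val_start=3, test_start=4)
cold = cold_items(train + val, test)

# Two test cases with hypothetical model rankings, evaluated at k=1.
rankings = [["a", "b"], ["b", "a"]]
targets = ["a", "c"]
hr = hit_rate_at_k(rankings, targets, k=1)
print(cold, hr)
```

Reporting the hit rate separately for targets inside and outside the cold set makes it visible whether gains on cold items come at the expense of warm items.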

Conclusion. Sequential ID-based recommenders provide robustness against many data quality (DQ) issues prevalent in traditional learning-to-rank (LTR) approaches, but struggle with item cold-start. DenseRec introduces a lightweight, non-destructive mechanism to incorporate multimodal content embeddings, mitigating item cold-start while retaining the strengths of ID-based modeling. More broadly, our findings highlight that understanding DQ challenges, including missing modalities in the form of cold-start, can directly inform the design of multimodal sequential models.

¹ https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2

Declaration on Generative AI. The authors have not employed any Generative AI tools.

[1] W.-C. Kang, J. McAuley, Self-attentive sequential recommendation, in: 2018 IEEE International Conference on Data Mining (ICDM), IEEE, 2018, pp. 197-206.

[2] J. M. Lichtenberg, A. De Candia, M. Rufini, DenseRec: Revisiting dense content embeddings for sequential transformer-based recommendation, arXiv preprint arXiv:2508.18442 (2025).

[3] Pomo, Jannach, Kim, Malitesta, A. C. M. Mancino, J. McAuley, A. Melchiorre, S. Nawaz, First international workshop on data quality-aware multimodal recommendation (DaQuaMRec), in: Proceedings of the Nineteenth ACM Conference on Recommender Systems, 2025, pp. 1378-1382.

[4] Singh, Vu, Mehta, Keshavan, Sathiamoorthy, Zheng, Hong, Heldt, Wei, Tandon, et al., Better generalization with semantic IDs: A case study in ranking for recommendations, in: Proceedings of the 18th ACM Conference on Recommender Systems, 2024, pp. 1039-1044.

[5] Hou, He, J. McAuley, W. X. Zhao, Learning vector-quantized item representation for transferable sequential recommenders, in: Proceedings of the ACM Web Conference 2023, 2023, pp. 1162-1171.

[6] Zhang, Zhou, Zeng, Shen, Are ID embeddings necessary? Whitening pre-trained text embeddings for effective sequential recommendation, in: 2024 IEEE 40th International Conference on Data Engineering (ICDE), IEEE, 2024, pp. 530-543.

[7] Ganhör, Moscati, Hausberger, Nawaz, Schedl, A multimodal single-branch embedding network for recommendation in cold-start and missing modality scenarios, in: Proceedings of the 18th ACM Conference on Recommender Systems, 2024, pp. 380-390.

[8] Hou, Li, He, Yan, Chen, J. McAuley, Bridging language and items for retrieval and recommendation, arXiv preprint arXiv:2403.03952 (2024).

[9] Reimers, I. Gurevych, Sentence-BERT: Sentence embeddings using siamese BERT-networks, in: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, 2019. URL: https://arxiv.org/abs/1908.10084.