Lingdu Kong

Free Access | Review Article | 12 June 2024 | Cited: 3

Bridging Modalities: A Survey of Cross-Modal Image-Text Retrieval

Tieying Li

Chinese Journal of Information Fusion | Volume 1, Issue 1: 79-92, 2024 | DOI: 10.62762/CJIF.2024.361895

Abstract

The rapid advancement of Internet technology, driven by social media and e-commerce platforms, has facilitated the generation and sharing of multimodal data, leading to increased interest in efficient cross-modal retrieval systems. Cross-modal image-text retrieval, encompassing tasks such as image query text (IqT) retrieval and text query image (TqI) retrieval, plays a crucial role in semantic searches across modalities. This paper presents a comprehensive survey of cross-modal image-text retrieval, addressing the limitations of previous studies that focused on single perspectives such as subspace learning or deep learning models. We categorize existing models into single-tower, dual-tower,...

