TY - JOUR
T1 - Efficient Compositional Translation Embedding for Visual Relationship Detection
AU - Heo, Yu-Jung
AU - Kim, Eun-Sol
AU - Choi, Woo Suk
AU - On, Kyoung-Woon
AU - Zhang, Byoung-Tak
JO - Journal of KIISE, JOK
PY - 2022
DA - 2022/1/14
DO - 10.5626/JOK.2022.49.7.544
KW - scene graph generation
KW - visual relationship detection
KW - image caption retrieval
KW - translation embedding
AB - Scene graphs are widely used to express high-order visual relationships between the objects present in an image. To generate scene graphs automatically, we propose an algorithm that detects visual relationships between objects and predicts each relationship as a predicate. Inspired by the well-known knowledge graph embedding method TransR, we present the CompTransR algorithm, which i) defines latent relational subspaces that reflect the compositional perspective of visual relationships and ii) encodes predicate representations by applying transitive constraints between object representations in each subspace. Our proposed model not only reduces computational complexity but also outperforms previous state-of-the-art methods on the predicate detection task across three benchmark datasets: VRD, VG200, and VrR-VG. We also show that scene graphs can be applied to image-caption retrieval, a high-level visual reasoning task, and that the scene graphs generated by our model improve retrieval performance.
ER -