VQA & Visual Dialog




Introduction

  This part covers our research on visual dialog and video question answering. Visual dialog is a multi-round extension of visual question answering (VQA): the interactions between the image and the multi-round question-answer pairs evolve progressively, and the relationships among the objects in the image shift with the current question. Video question answering aims to build models that can answer a question referring to a video, which requires both appearance and motion information; establishing the complex semantic connections between the text and the various visual cues remains difficult. The key to both tasks is effective relation reasoning, and current research is mainly based on graph neural networks and memory networks.
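To make the idea of graph-based relation reasoning concrete, the sketch below shows one possible (hypothetical, not from any paper above) message-passing loop: objects in the image are graph nodes, edge weights between them are conditioned on the question feature, and node features are updated by aggregating their neighbors. All function and variable names here are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax over the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def question_guided_message_passing(obj_feats, q_feat, steps=2):
    """Illustrative relation-reasoning loop (a sketch, not the method of
    the papers listed here): edge weights among object nodes are
    conditioned on the question, then each node aggregates its neighbors.

    obj_feats: (n, d) object features detected from the image/video frame
    q_feat:    (d,)   question feature vector
    """
    n, d = obj_feats.shape
    h = obj_feats.copy()
    for _ in range(steps):
        # Question-conditioned pairwise scores: (h_i * q) . h_j,
        # scaled by sqrt(d) as in dot-product attention.
        scores = (h * q_feat) @ h.T / np.sqrt(d)
        adj = softmax(scores, axis=-1)   # row-normalized edge weights
        h = 0.5 * h + 0.5 * (adj @ h)    # residual neighbor aggregation
    return h
```

In a full model the scoring and update steps would be learned (e.g. with projection matrices and a GRU-style update), and the loop would be repeated once per dialog round so the graph adapts to each new question.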

Publications  

Pairwise VLAD Interaction Network for Video Question Answering
Hui Wang, Dan Guo, Xiansheng Hua, and Meng Wang
ACM International Conference on Multimedia (ACM MM), 2021
[Paper] [BibTex]
Context-Aware Graph Inference with Knowledge Distillation for Visual Dialog
Dan Guo, Hui Wang, and Meng Wang
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
[Paper] [BibTex]
Iterative Context-Aware Graph Inference for Visual Dialog
Dan Guo, Hui Wang, Hanwang Zhang, Zhengjun Zha, and Meng Wang
Conference on Computer Vision and Pattern Recognition (CVPR), 2020
[Paper] [BibTex]
Textual-Visual Reference-Aware Attention Network for Visual Dialog
Dan Guo, Hui Wang, Shuhui Wang, and Meng Wang
IEEE Transactions on Image Processing (TIP), 2020
[Paper] [BibTex]
Dual Visual Attention Network for Visual Dialog
Dan Guo, Hui Wang, and Meng Wang
International Joint Conference on Artificial Intelligence (IJCAI), 2019
[Paper] [BibTex]

Exhibition

Resources

Waiting for Update...

Paper-List

  1. Hui Wang, Dan Guo, Xiansheng Hua, and Meng Wang, "Pairwise VLAD Interaction Network for Video Question Answering", ACM International Conference on Multimedia (ACM MM), 2021. [Link]
  2. Dan Guo, Hui Wang, and Meng Wang, "Context-Aware Graph Inference with Knowledge Distillation for Visual Dialog", IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021. [Link]
  3. Dan Guo, Hui Wang, Hanwang Zhang, Zhengjun Zha, and Meng Wang, "Iterative Context-Aware Graph Inference for Visual Dialog", Conference on Computer Vision and Pattern Recognition (CVPR), 2020. [Link]
  4. Dan Guo, Hui Wang, Shuhui Wang, and Meng Wang, "Textual-Visual Reference-Aware Attention Network for Visual Dialog", IEEE Transactions on Image Processing (TIP), 2020. [Link]
  5. Dan Guo, Hui Wang, and Meng Wang, "Dual Visual Attention Network for Visual Dialog", International Joint Conference on Artificial Intelligence (IJCAI), 2019. [Link]



© VUT-HFUT 2021       Last updated on Sep. 15, 2021