VQA & Visual Dialog




Introduction

  This part covers our research on visual dialog and video question answering. Visual dialog is a multi-round extension of visual question answering (VQA): the interactions between the image and the multi-round question-answer pairs evolve progressively, and the relationships among the objects in the image shift with the current question. Video question answering aims to build models that can answer a question referring to a video, which requires both appearance and motion information; establishing the complex semantic connections between the text and the various visual cues remains difficult. The key to both tasks is effective relation reasoning, and current research is mainly based on graph neural networks and memory networks.
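To make the idea of graph-based relation reasoning concrete, the sketch below shows one possible (hypothetical, not from any paper above) message-passing loop: objects in the image are graph nodes, edge weights between them are conditioned on the question feature, and node features are updated by aggregating their neighbors. All function and variable names here are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax over the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def question_guided_message_passing(obj_feats, q_feat, steps=2):
    """Illustrative relation-reasoning loop (a sketch, not the method of
    the papers listed here): edge weights among object nodes are
    conditioned on the question, then each node aggregates its neighbors.

    obj_feats: (n, d) object features detected from the image/video frame
    q_feat:    (d,)   question feature vector
    """
    n, d = obj_feats.shape
    h = obj_feats.copy()
    for _ in range(steps):
        # Question-conditioned pairwise scores: (h_i * q) . h_j,
        # scaled by sqrt(d) as in dot-product attention.
        scores = (h * q_feat) @ h.T / np.sqrt(d)
        adj = softmax(scores, axis=-1)   # row-normalized edge weights
        h = 0.5 * h + 0.5 * (adj @ h)    # residual neighbor aggregation
    return h
```

In a full model the scoring and update steps would be learned (e.g. with projection matrices and a GRU-style update), and the loop would be repeated once per dialog round so the graph adapts to each new question.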

Publications  

Pairwise VLAD Interaction Network for Video Question Answering
Hui Wang, Dan Guo, Xiansheng Hua, and Meng Wang
ACM International Conference on Multimedia (ACM MM), 2021
[Paper] [BibTex]
Context-Aware Graph Inference with Knowledge Distillation for Visual Dialog
Dan Guo, Hui Wang, and Meng Wang
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021
[Paper] [BibTex]
Iterative Context-Aware Graph Inference for Visual Dialog
Dan Guo, Hui Wang, Hanwang Zhang, Zhengjun Zha, and Meng Wang
Conference on Computer Vision and Pattern Recognition (CVPR), 2020
[Paper] [BibTex]
Textual-Visual Reference-Aware Attention Network for Visual Dialog
Dan Guo, Hui Wang, Shuhui Wang, and Meng Wang
IEEE Transactions on Image Processing (TIP), 2020
[Paper] [BibTex]
Dual Visual Attention Network for Visual Dialog
Dan Guo, Hui Wang, and Meng Wang
International Joint Conference on Artificial Intelligence (IJCAI), 2019
[Paper] [BibTex]

Exhibition

Resources

Waiting for Update...

Paper-List

  1. Hui Wang, Dan Guo, Xiansheng Hua, and Meng Wang, "Pairwise VLAD Interaction Network for Video Question Answering", ACM International Conference on Multimedia (ACM MM), 2021. [Link]
  2. Dan Guo, Hui Wang, and Meng Wang, "Context-Aware Graph Inference with Knowledge Distillation for Visual Dialog", IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2021. [Link]
  3. Dan Guo, Hui Wang, Hanwang Zhang, Zhengjun Zha, and Meng Wang, "Iterative Context-Aware Graph Inference for Visual Dialog", Conference on Computer Vision and Pattern Recognition (CVPR), 2020. [Link]
  4. Dan Guo, Hui Wang, Shuhui Wang, and Meng Wang, "Textual-Visual Reference-Aware Attention Network for Visual Dialog", IEEE Transactions on Image Processing (TIP), 2020. [Link]
  5. Dan Guo, Hui Wang, and Meng Wang, "Dual Visual Attention Network for Visual Dialog", International Joint Conference on Artificial Intelligence (IJCAI), 2019. [Link]



© VUT-HFUT 2021       Last updated on Sep. 15, 2021