The primary purpose of this workshop is to hold a challenge on Visual Question Answering on the VQA dataset. VQA is a new dataset containing open-ended and multiple-choice questions about images. These questions require an understanding of vision, language, and commonsense knowledge to answer. This workshop will provide an opportunity to benchmark algorithms on the VQA dataset and to identify state-of-the-art algorithms.
A secondary goal of this workshop is to bring together researchers interested in Visual Question Answering to share state-of-the-art approaches, best practices, and perspectives on future directions in multi-modal AI. We invite submissions of extended abstracts of at most two pages describing work in areas such as Visual Question Answering, (Textual) Question Answering, Commonsense Knowledge, Video Question Answering, Image/Video Captioning, and other problems at the intersection of vision and language. Accepted abstracts will be presented as posters at the workshop. The workshop will be held on June 26th, 2016 at the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2016.
Conference date: June 26, 2016