In recent years, deep learning has been a major force of change in most computer vision tasks. In video analysis problems, however, such as action recognition and detection, motion analysis, and tracking, shallow architectures remain surprisingly competitive. What is the reason for this conundrum? Larger datasets are part of the solution: the recently proposed Sports1M has helped in the realistic training of large motion networks. Still, the breakthrough has not yet arrived.
Assuming that the recently proposed video datasets are large enough to train deep networks for video, another likely culprit for the standstill in video analysis is the capacity of the existing deep models. More specifically, the existing deep networks for video analysis might not be sophisticated enough to address the complexity of motion information. This makes sense, as videos introduce exponentially more complexity than static images. Unfortunately, state-of-the-art motion representation models are extensions of existing image representations rather than motion-dedicated ones. Brave, new, motion-specific representations are likely to be needed for a breakthrough in video analysis.
The workshop focuses on motion representations related to, but not limited to, the following topics:
Influence of motion in object recognition, object affordance, scene understanding
Object and optical flow
Motion prediction, causal reasoning and forecasting
Event and action recognition
Spatio-temporal action localization
Modeling human motion in videos and video streams
Motion segmentation and saliency
Tracking of objects in space and time
Unsupervised action and actom discovery using ego motion
Applications of motion understanding and video dynamics in sports, healthcare, autonomous driving, driver assistance and robotics
Conference date: July 21, 2017