Keynote Talk

Ivan Laptev: Weakly-supervised Learning from Images and Video

Recent progress in visual recognition goes hand in hand with supervised learning and large-scale training data. While the amount of existing images and videos is huge, their detailed annotation is expensive and often prohibitive. To address this problem, this talk will focus on weakly-supervised learning methods that use incomplete and noisy annotation for training. In the first part, I will discuss recognition from still images and describe our work on weakly-supervised convolutional neural networks. I will present a network that learns to recognize and localize objects as well as human actions without using location supervision at training time. Somewhat surprisingly, our weakly-supervised method achieves state-of-the-art performance comparable to that of its strongly-supervised counterparts. The second part of the talk will focus on learning human actions from videos and corresponding textual descriptions in the form of movie scripts or narrations. I will describe our recent formulation of this problem as a quadratic program with constraints and show its successful application to the joint learning of actions and actors from movies, as well as to the learning of key steps from narrated instruction videos. I will also discuss future research directions.


About the Speaker:

Ivan Laptev is a research director at INRIA Paris, France. He received a Habilitation degree from École Normale Supérieure in 2013 and a PhD degree in Computer Science from the Royal Institute of Technology in 2004. Ivan's main research interests include visual recognition of human actions, objects and interactions. He has published over 50 papers at international conferences and in journals of computer vision and machine learning. He serves as an associate editor of the IJCV and TPAMI journals, was an area chair for CVPR’10, ’13, ’15, ICCV’11, ECCV’12, ’14 and ACCV’14, and has co-organized several tutorials, workshops and challenges at major computer vision conferences. He has also co-organized a series of INRIA summer schools on computer vision and machine learning (2010-2013). He received an ERC Starting Grant in 2012.