CILVR Seminar: Visual Learning in the Open World

Speaker: Mengye Ren

Location: Online
Videoconference link: https://nyu.zoom.us/j/97986595706

Date: Tuesday, April 5, 2022

Over the past decades, we have seen machine learning make great strides in understanding visual scenes. Yet most of its success relies on training models offline on massive amounts of data in a closed world and evaluating them in a similar test environment. In this talk, I would like to envision an alternative paradigm that allows machines to acquire visual knowledge from an online stream of data in an open world, which entails abilities such as learning visual representations and concepts efficiently from limited, non-i.i.d. data. These capabilities will be core to future applications of real-world agents such as robotics and assistive technologies.

I will share three recent papers toward this goal; together they form three levels of our open-world visual recognition pipeline: the concept level, the grouping level, and the representation level. First, on the concept level, I will introduce a new learning paradigm that rapidly learns new concepts in a continual stream with only a few labels. Second, on the grouping level, I will discuss how to learn both representations and concept classes online, without any labeled data, by grouping similar objects into clusters. Lastly, on the representation level, I will present a new algorithm that learns general visual representations from high-resolution raw video.

With these levels combined, I am hopeful that future intelligent agents will be able to learn on the fly without requiring manually collected data beforehand.