Object-centric Video Representation for Long-term Action Anticipation

Abstract

This work builds object-centric video representations for long-term action anticipation by extracting task-specific object features from vision-language pretrained models and retrieving relevant objects across temporal scales.

Publication
In IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2024
Changcheng Fu
Ph.D. Student in Computer Science at University of Southern California

Hi, I’m Changcheng, a Ph.D. student in Computer Science at the University of Southern California, advised by Prof. Ram Nevatia. My research focuses on computer vision, deep learning, machine learning, and multimodal large language models for visual understanding and long-term action anticipation.