AntGPT: Can Large Language Models Help Long-term Action Anticipation from Videos?

Abstract

AntGPT builds long-term action anticipation from videos around large language models: observed actions are used to predict future actions and to infer longer-term goals through language-model-based generation and planning.

Publication
In International Conference on Learning Representations (ICLR), 2024
Changcheng Fu
Ph.D. Student in Computer Science at University of Southern California

Hi, I’m Changcheng, a Ph.D. student in Computer Science at the University of Southern California, advised by Prof. Ram Nevatia. My research focuses on computer vision, deep learning, machine learning, and multimodal large language models for visual understanding and long-term action anticipation.