Vid2Seq: A pretrained visual language model for describing multi-event videos
7 by og_kalu | 2 comments on Hacker News.