Vineet Gandhi

F24, CVIT,
KCIS Research Block,
IIIT Hyderabad, Gachibowli
Hyderabad, India, 500032
I am currently an associate professor at IIIT Hyderabad, where I am affiliated with the Center for Visual Information Technology (CVIT). I also advise Animaker.com, a beautiful animation startup, on their AI and ML related efforts. I completed my PhD at INRIA Rhone Alpes/University of Grenoble in applied mathematics and computer science (mathématique appliquée et informatique) under the guidance of Remi Ronfard, funded by the CIBLE scholarship from Region Rhone Alpes. Prior to this, I completed my Masters with an Erasmus Mundus scholarship under the CIMET consortium. I am extremely thankful to the European Union for giving me this opportunity, which had a huge impact on both my professional and personal life. I spent a semester each in Spain, Norway and France, then joined INRIA for my master's thesis and continued there for my PhD. I was also lucky to travel and deeply explore Europe (from the south of Spain to the north of Norway), at times purely surviving on gestural expressions for communication. I obtained my Bachelor of Technology degree from the Indian Institute of Information Technology, Design and Manufacturing (IIITDM) Jabalpur, India (I belong to the first batch of the institute).
I like to focus on problems with tangible goals, and I try to build end-to-end solutions (with neat engineering). My current research interests are in applied machine learning for applications in computer vision and multimedia. In recent years, I have been exploring specific problems in computational videography/cinematography; image/video editing; multi-sensor fusion for 3D analytics; sports analytics; document analytics; and visual detection, tracking and recognition. Outside of work, I like to spend time with my family, play cards, go on bicycle rides, read ancient literature and explore mountains.
News [ archives ]
Jun 2025 | Our paper titled “Pseudo-labelling meets Label Smoothing for Noisy Partial Label Learning” received the best paper award at the FGVC workshop at CVPR 2025. Congratulations to Darshana; the hard work has paid off. |
May 2025 | Congratulations to Kawshik on successfully defending his thesis and completing his Dual Degree. Though NLP isn’t my core area, I ended up working on it with him, and it was challenging but rewarding. Big thanks to Makarand and Shubham for playing a crucial role in his thesis work. Best wishes to Kawshik for his next chapter at Google DeepMind. |
May 2025 | Our paper “NAM-to-Speech Conversion with Multitask-Enhanced Autoregressive Models” has been accepted at Interspeech 2025! Speech samples are available here. |
Feb 2025 | Two full papers accepted at CVPR 2025. The first, TIDE, improves model generalization by localizing class-specific concepts and supports test-time correction. The second, VELOCITI, benchmarks video-language models on compositional understanding via a strict video-language entailment task tailored to modern VLMs. Try it on HuggingFace. |
Jan 2025 | Can LLMs untangle who’s who in complex stories? Our NAACL 2025 paper, IdentifyMe, puts them to the test with a new coreference benchmark! |