Vineet Gandhi
F24, CVIT,
KCIS Research Block,
IIIT Hyderabad, Gachibowli
Hyderabad, India, 500032
I am currently an associate professor at IIIT Hyderabad, where I am affiliated with the Center for Visual Information Technology (CVIT). I also advise an animation startup, Animaker.com, on their AI and ML efforts. I completed my PhD at INRIA Rhone Alpes/University of Grenoble in applied mathematics and computer science (mathématique appliquée et informatique) under the guidance of Remi Ronfard, funded by a CIBLE scholarship from Region Rhone Alpes. Prior to this, I completed my Masters with an Erasmus Mundus scholarship under the CIMET consortium. I am extremely thankful to the European Union for giving me this opportunity, which had a huge impact on both my professional and personal life. I spent a semester each in Spain, Norway, and France, then joined INRIA for my master thesis and continued there for my PhD. I was also lucky to travel and deeply explore Europe (from the south of Spain to the north of Norway), at times surviving purely on gestural expressions for communication. I obtained my Bachelor of Technology degree from the Indian Institute of Information Technology, Design and Manufacturing (IIITDM) Jabalpur, India (I belong to the first batch of the institute).
I like to focus on problems with tangible goals, and I try to build end-to-end solutions (with neat engineering). My current research interests are in applied machine learning for computer vision and multimedia. In recent years, I have been exploring problems in model generalization, text-to-speech, speech reconstruction from varying signals for accessibility, vision-language models, and automated cinematography/editing. Outside work, I like to spend time with my family, play cards, and read ancient literature.
News [ archives ]
| Mar 2026 | Two papers accepted at CVPR 2026 (one in main track and one in findings). Congratulations to Aishwarya for the amazing work. CCI presents striking visualization results for CLIP, while Lite-embed enables adapting CLIP to rare classes with just a few images, without modifying the model. Wonderful collaboration continues with Aishwarya and Srikrishna. |
|---|---|
| Mar 2026 | Our paper CLARIS: Clear and Intelligible Speech from Whispered and Dysarthric Voices accepted at CHI 2026. This work shows that dysarthric speech can be converted into normal speech in real time. Congratulations to Neil, Yash, and Shirish, with special mention to Yash on his first PhD paper. Speech samples available here. |
| Sep 2025 | Our paper on Simplifying Knowledge Transfer in Pretrained Models accepted at TMLR. Congratulations to Siddharth. |
| Jul 2025 | Gave a talk titled The Sound Dimension: Speech and Audio in Multimodal AI at the CVIT workshop 2025. Slides (PDF): here |
| Jun 2025 | Our paper titled “Pseudo-labelling meets Label Smoothing for Noisy Partial Label Learning” received the best paper award at the FGVC workshop at CVPR 2025. Congratulations to Darshana, the hard work has paid off. |