See also: Google Scholar, Journals & Conferences, Thesis, Patents,
Preprints
- Post-hoc Calibration of Neural Networks, Kartik Gupta, Amir Rahimi, Thalaiyasingam Ajanthan, Thomas Mensink, Cristian Sminchisescu, and Richard Hartley. [arxiv]
- HAMMR: HierArchical MultiModal React agents for generic VQA, Lluis Castrejon, Thomas Mensink, Howard Zhou, Vittorio Ferrari, André Araujo, and Jasper Uijlings. [arxiv]
Journals, conferences & workshops
(peer reviewed)
- How (not) to ensemble LVLMs for VQA, Lisa Alazraki, Lluis Castrejon, Mostafa Dehghani, Fantine Huot, Jasper Uijlings, and Thomas Mensink | in 4th I Can’t Believe It’s Not Better Workshop (co-located with NeurIPS 2023) [pmlr, arXiv]
- Infinite Class Mixup, Thomas Mensink & Pascal Mettes | in BMVC (poster) 2023 [arxiv]
- Encyclopedic VQA: Visual questions about detailed properties of fine-grained categories, Thomas Mensink, Jasper Uijlings, Lluis Castrejon, Arushi Goel, Felipe Cadar, Howard Zhou, Fei Sha, André Araujo, and Vittorio Ferrari | in ICCV (poster) 2023 [arxiv]
- Scaling Vision Transformers to 22 Billion Parameters, Mostafa Dehghani, Josip Djolonga, Basil Mustafa, Piotr Padlewski, Jonathan Heek, Justin Gilmer, Andreas Steiner, Mathilde Caron, Robert Geirhos, Ibrahim Alabdulmohsin, Rodolphe Jenatton, Lucas Beyer, Michael Tschannen, Anurag Arnab, Xiao Wang, Carlos Riquelme, Matthias Minderer, Joan Puigcerver, Utku Evci, Manoj Kumar, Sjoerd van Steenkiste, Gamaleldin F. Elsayed, Aravindh Mahendran, Fisher Yu, Avital Oliver, Fantine Huot, Jasmijn Bastings, Mark Patrick Collier, Alexey Gritsenko, Vighnesh Birodkar, Cristina Vasconcelos, Yi Tay, Thomas Mensink, Alexander Kolesnikov, Filip Pavetić, Dustin Tran, Thomas Kipf, Mario Lučić, Xiaohua Zhai, Daniel Keysers, Jeremiah Harmsen & Neil Houlsby | in ICML (oral) 2023 [arxiv]
- The Missing Link: Finding label relations across datasets, Jasper Uijlings, Thomas Mensink, Vittorio Ferrari, in ECCV (poster) | 2022 [arxiv]
- How stable are transferability metrics?, Andrea Agostinelli, Michal Pándy, Jasper Uijlings, Thomas Mensink, Vittorio Ferrari, in ECCV (poster) | 2022 [arxiv]
- Transferability Metrics for Selecting Source Model Ensembles, Andrea Agostinelli, Jasper Uijlings, Thomas Mensink, and Vittorio Ferrari, in CVPR (oral) | 2022 [arxiv]
- Transferability Estimation using Bhattacharyya Class Separability, Michal Pándy, Andrea Agostinelli, Jasper Uijlings, Vittorio Ferrari, and Thomas Mensink, in CVPR (poster) | 2022 [arxiv]
- Factors of Influence for Transfer Learning across Diverse Appearance Domains and Task Types, Thomas Mensink, Jasper Uijlings, Alina Kuznetsova, Michael Gygli, and Vittorio Ferrari, in Transactions on Pattern Analysis and Machine Intelligence | 2021. [pdf, ieee, arxiv]
- Automatic Generation of Dense Non-rigid Optical Flow, Hoang-An Le, Tushar Nimbhorkar, Thomas Mensink, Anil S. Baslamisli, Sezer Karaoglu, and Theo Gevers, in Computer Vision and Image Understanding | 2021. [pdf, science-direct, arxiv]
- Neural Feature Matching in Implicit 3D Representations, Yunlu Chen, Basura Fernando, Hakan Bilen, Thomas Mensink, and Efstratios Gavves, in International Conference on Machine Learning (ICML) | 2021. [pdf, pmlr]
- Calibration of Neural Networks using Splines, Kartik Gupta, Amir Rahimi, Thalaiyasingam Ajanthan, Thomas Mensink, Cristian Sminchisescu, and Richard Hartley, In International Conference on Learning Representations (ICLR) | 2021 [pdf], [openreview], [arxiv]
- Multi-Loss Weighting with Coefficient of Variations, Rick Groenendijk, Sezer Karaoglu, Theo Gevers, and Thomas Mensink, In Winter Conference on Applications of Computer Vision (WACV) | 2021. [pdf] [arxiv]
- EDEN: Multimodal Synthetic Dataset of Enclosed GarDEN Scenes, Hoang-An Le, Thomas Mensink, Partha Das, Sezer Karaoglu, and Theo Gevers, In Winter Conference on Applications of Computer Vision (WACV) | 2021. [pdf] [arxiv]
- Range Conditioned Dilated Convolutions for Scale Invariant 3D Object Detection, Alex Bewley, Pei Sun, Thomas Mensink, Dragomir Anguelov, and Cristian Sminchisescu, In the Conference on Robot Learning (CoRL) | 2020. [pdf] [arxiv]
- Novel View Synthesis from Single Images via Point Cloud Transformation, Hoang-An Le, Thomas Mensink, Partha Das, and Theo Gevers, In British Machine Vision Conference (BMVC) | 2020. [pdf] [arxiv]
- PointMixup: Data Augmentation for Point Clouds, Yunlu Chen, Vincent Tao Hu, Efstratios Gavves, Thomas Mensink, Pascal Mettes, Pengwan Yang, and Cees Snoek, In European Conference on Computer Vision (ECCV) | 2020 (spotlight presentation – top 5%). [pdf] [arxiv]
- On the benefit of adversarial training for monocular depth estimation, Rick Groenendijk, Sezer Karaoglu, Theo Gevers, and Thomas Mensink, In Computer Vision and Image Understanding (CVIU) | 2020. [pdf] [arxiv]
- Interactive Exploration of Journalistic Video Footage through Multimodal Semantic Matching, Sarah Ibrahimi, Shuo Chen, Devanshu Arya, Arthur Camara, Yunlu Chen, Tanja Crijns, Maurits van der Goes, Thomas Mensink, Emiel van Miltenburg, Daan Odijk, William Thong, Jiaojiao Zhao, Pascal Mettes, in ACM Multimedia – Demo track | 2019 [pdf]. This work is the result of the ICT with Industry workshop with RTL Nieuws.
- 3D Neighborhood Convolution: Learning Depth-Aware Features for RGB-D and RGB Semantic Segmentation, Yunlu Chen, Thomas Mensink and Efstratios Gavves, In International Conference on 3D Vision (3DV) | 2019. [pdf] [arxiv]
- IterGANs: Iterative GANs to learn and control 3D object transformation, Ysbrand Galama and Thomas Mensink, In Computer Vision and Image Understanding (CVIU) | 2019. [pdf] [doi] [code] [arxiv]
- Unsupervised Generation of Optical Flow Datasets from Videos in the Wild, Hoang-An Le, Tushar Nimbhorkar, Thomas Mensink, Sezer Karaoglu, Anil Baslamisli and Theo Gevers, arXiv preprint (1812.01946) | 2018. [pdf] [arxiv]
- Three for one and one for three: Flow, Segmentation, and Surface Normals, Hoang-An Le, Anil Baslamisli, Thomas Mensink and Theo Gevers, In British Machine Vision Conference (BMVC) | 2018 (oral, acceptance rate 4.3%) [pdf] [arxiv]
- IterGANs: Iterative GANs for Rotating Visual Objects, Ysbrand Galama and Thomas Mensink, In International Conference on Learning Representations – Workshop (ICLRw) | 2018. [pdf] [poster]
- DeepNCM: Deep Nearest Class Mean Classifiers, Samantha Guerriero, Barbara Caputo and Thomas Mensink, In International Conference on Learning Representations – Workshop (ICLRw) | 2018. [pdf] [poster] [code]
- The New Modality: Emoji Challenges in Prediction, Anticipation, and Retrieval, Spencer Cappallo, Stacey Svetlichnaya, Pierre Garrigues, Thomas Mensink and Cees G. M. Snoek, In Transactions on Multimedia (TMM) | 2018. [pdf] [arxiv] [doi]
- SAVI: Spotting Audio-Visual Inconsistencies in Manipulated Video, Robert Bolles, Brian Burns, Martin Graciarena, Andreas Kathol, Aaron Lawson, Mitchell McLaren and Thomas Mensink, In CVPR Workshop on Media Forensics | 2017. [pdf]
- Music-Guided Video Summarization using Quadratic Assignments, Thomas Mensink, Thomas Jongstra, Pascal Mettes and Cees Snoek, In ACM International Conference on Multimedia Retrieval (ICMR) | 2017 (Spotlight). [pdf]
- Video2vec Embeddings Recognize Events when Examples are Scarce, Amir Habibian, Thomas Mensink and Cees Snoek, In Transactions on Pattern Analysis and Machine Intelligence (PAMI) | 2016. [pdf] [arxiv] [doi]
- Video Stream Retrieval of Unseen Queries using Semantic Memory, Spencer Cappallo, Thomas Mensink and Cees Snoek, In British Machine Vision Conference (BMVC) (Oral, acceptance rate 10%) | 2016. [pdf]
- Online Open World Recognition, Rocco de Rosa, Thomas Mensink and Barbara Caputo, Technical report | arXiv | 2016. [pdf] [arxiv]
- Pooling Objects for Recognizing Scenes without Examples, Svetlana Kordumova, Thomas Mensink and Cees Snoek, In ACM International Conference on Multimedia Retrieval (ICMR) (Best paper award) | 2016. [pdf]
- Objects2action: Classifying and localizing actions without any video example, Mihir Jain, Jan van Gemert, Thomas Mensink and Cees Snoek, In International Conference on Computer Vision (ICCV) | 2015. [pdf] [poster]
- Active Transfer Learning with Zero-Shot Priors: Reusing Past Datasets for Future Tasks, Efstratios Gavves, Thomas Mensink, Tatiana Tommasi, Cees Snoek and Tinne Tuytelaars, In International Conference on Computer Vision (ICCV) | 2015. [pdf]
- Image2Emoji: Zero-shot Emoji Prediction for Visual Media, Spencer Cappallo, Thomas Mensink and Cees Snoek, In ACM International Conference on Multimedia (ACMMM) | 2015. [pdf] [poster]
- Query-by-Emoji Video Search, Spencer Cappallo, Thomas Mensink and Cees Snoek, Demo at ACMMM | 2015. [pdf]
- Event Fisher Vectors: Robust Encoding Visual Diversity of Visual Streams, Markus Nagel, Thomas Mensink and Cees Snoek, In British Machine Vision Conference (BMVC) (Oral, acceptance rate 7%) | 2015. [pdf]
- Latent Factors of Visual Popularity Prediction, Spencer Cappallo, Thomas Mensink and Cees Snoek, In ACM International Conference on Multimedia Retrieval (ICMR) (Oral) | 2015. [pdf] [presentation]
- Bag-of-Fragments: Selecting and encoding video fragments for event detection and recounting, Pascal Mettes, Jan van Gemert, Spencer Cappallo, Thomas Mensink and Cees Snoek, In ACM International Conference on Multimedia Retrieval (ICMR) | 2015. [pdf]
- Discovering Semantic Vocabularies for Cross-Media Retrieval, Amir Habibian, Thomas Mensink and Cees Snoek, In ACM International Conference on Multimedia Retrieval (ICMR) | Oral | 2015. [pdf]
- MediaMill at TRECVID 2014: Searching Concepts, Objects, Instances and Events in Video, Cees Snoek, Koen van de Sande, Daniel Fontijne, Spencer Cappallo, Jan van Gemert, Amir Habibian, Thomas Mensink, Pascal Mettes, Ran Tao, Dennis Koelma and Arnold Smeulders, Proceedings of the TRECVID Workshop (TRECVID) | 2014. [pdf]
- VideoStory: A New Multimedia Embedding for Few-Example Recognition and Translation of Events, Amir Habibian, Thomas Mensink and Cees Snoek, In ACM International Conference on Multimedia (ACMMM) | Best paper award | 2014. [pdf] [poster] [slides]
- Attributes Make Sense on Segmented Objects, Zhenyang Li, Efstratios Gavves, Thomas Mensink and Cees Snoek, In European Conference on Computer Vision (ECCV) | 2014. [pdf]
- Robustifying Descriptor Instability using Fisher Vectors, Ivo Everts, Jan van Gemert, Thomas Mensink and Theo Gevers, In Transactions on Image Processing (TIP) | 2014. [pdf]
- COSTA: Co-Occurrence Statistics for Zero-Shot Classification, Thomas Mensink, Efstratios Gavves and Cees Snoek, In Conference on Computer Vision and Pattern Recognition (CVPR) | 2014. [pdf] [poster]
- The Rijksmuseum Challenge: Museum-Centered Visual Recognition, Thomas Mensink and Jan van Gemert, In ACM International Conference on Multimedia Retrieval (ICMR) | 2014. [pdf] [poster] [code] [data]
- Composite Concept Discovery for Zero-Shot Video Event Detection, Amir Habibian, Thomas Mensink and Cees Snoek, In ACM International Conference on Multimedia Retrieval (ICMR) | Oral | 2014. [pdf]
- Distance-Based Image Classification: Generalizing to New Classes at Near Zero Cost, Thomas Mensink, Jakob Verbeek, Florent Perronnin and Gabriela Csurka, In Transactions on Pattern Analysis and Machine Intelligence (PAMI) | 2013. [pdf] [Tech Report] [Book Chapter]
- Image Classification with the Fisher Vector: Theory and Practice, Jorge Sánchez, Florent Perronnin, Thomas Mensink and Jakob Verbeek, In International Journal of Computer Vision (IJCV) | 2013. [pdf] [code] [Tech Report]
- Large Scale Metric Learning for Distance-Based Image Classification on Open Ended Data Sets, Thomas Mensink, Jakob Verbeek, Florent Perronnin and Gabriela Csurka, Chapter in Advanced Topics in Computer Vision, G. M. Farinella, S. Battiato, R. Cipolla, eds. | 2013. [pdf]
- Tree-structured CRF Models for Interactive Image Labeling, Thomas Mensink, Jakob Verbeek and Gabriela Csurka, In Transactions on Pattern Analysis and Machine Intelligence (PAMI) | 2012. [pdf]
- Metric Learning for Large Scale Image Classification: Generalizing to New Classes at Near-Zero Cost, Thomas Mensink, Jakob Verbeek, Florent Perronnin and Gabriela Csurka, In European Conference on Computer Vision (ECCV) | Oral, acceptance rate 2.8% | 2012. [pdf]
- Face recognition from caption-based supervision, Matthieu Guillaumin, Thomas Mensink, Jakob Verbeek and Cordelia Schmid, In International Journal of Computer Vision (IJCV) | 2012. [pdf] [Tech Report]
- Learning structured prediction models for interactive image labeling, Thomas Mensink, Jakob Verbeek and Gabriela Csurka, In Conference on Computer Vision and Pattern Recognition (CVPR) | 2011. [pdf] [poster]
- Learning to Rank and Quadratic Assignment, Thomas Mensink, Jakob Verbeek and Tiberio Caetano, In NIPS Workshop on Discrete Optimization in Machine Learning | 2011. [pdf] [poster]
- Trans Media Relevance Feedback for Image Autoannotation, Thomas Mensink, Jakob Verbeek and Gabriela Csurka, In British Machine Vision Conference (BMVC) | 2010. [pdf] [poster] [Tech Report]
- EP for Efficient Stochastic Control with Obstacles, Thomas Mensink, Jakob Verbeek and Bert Kappen, In European Conference on Artificial Intelligence (ECAI) | Oral | 2010. [pdf]
- LEAR and XRCE’s participation to Visual Concept Detection Task – ImageCLEF 2010, Thomas Mensink, Gabriela Csurka, Florent Perronnin, Jorge Sánchez and Jakob Verbeek, Working Notes of the CLEF Workshop | 2010. [pdf]
- Apprentissage de distance pour l’annotation d’images par plus proches voisins, Matthieu Guillaumin, Jakob Verbeek, Cordelia Schmid and Thomas Mensink, In Reconnaissance des Formes et Intelligence Artificielle | 2010. [pdf]
- Improving the Fisher Kernel for Large-Scale Image Classification, Florent Perronnin, Jorge Sánchez and Thomas Mensink, In European Conference on Computer Vision (ECCV) | Koenderink test-of-time award 2020 | 2010. [pdf] [poster]
- Image Annotation with TagProp on the MIRFLICKR set, Jakob Verbeek, Matthieu Guillaumin, Thomas Mensink and Cordelia Schmid, In ACM Multimedia Information Retrieval | 2010. [pdf]
- INRIA-LEAR’s participation to ImageCLEF 2009, Matthijs Douze, Matthieu Guillaumin, Thomas Mensink, Cordelia Schmid and Jakob Verbeek, Working Notes of the CLEF Workshop | 2009. [pdf]
- TagProp: Discriminative Metric Learning in Nearest Neighbor Models for Image Auto-Annotation, Matthieu Guillaumin, Thomas Mensink, Jakob Verbeek and Cordelia Schmid, In International Conference on Computer Vision (ICCV) | Oral, acceptance rate 3.8% | 2009. [pdf]
- Improving People Search Using Query Expansions: How Friends Help To Find People, Thomas Mensink and Jakob Verbeek, In European Conference on Computer Vision (ECCV) | Oral, acceptance rate 4.6% | 2008. [pdf] [Extended Abstract BNAIC 2008] [Face Finder Demo BNAIC 2008]
- Automatic face naming with caption-based supervision, Matthieu Guillaumin, Thomas Mensink, Jakob Verbeek and Cordelia Schmid, In Conference on Computer Vision and Pattern Recognition (CVPR) | 2008. [pdf]
- Multi-Observations Newscast EM for Distributed Appearance Based Tracking, Thomas Mensink, Wojciech Zajdel and Ben Kröse, In Benelux Conference on Artificial Intelligence (BNAIC) | 2007. [pdf]
- Distributed EM Learning for Appearance Based Multi-Camera Tracking, Thomas Mensink, Wojciech Zajdel and Ben Kröse, In International Conference on Distributed Smart Cameras (ICDSC) | 2007 | Oral. [pdf]
Context:
In computer vision, the two main journals are the IEEE Transactions on Pattern Analysis and Machine Intelligence and the International Journal of Computer Vision, both with acceptance rates below 30%. The three main computer vision conferences are the IEEE International Conference on Computer Vision (ICCV), the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), and the European Conference on Computer Vision (ECCV). These conferences are very selective (in general, fewer than 25% of submitted articles are accepted), and their proceedings play a role as important as that of international journals.
Thesis
- Learning Image Classification and Retrieval Models, Thomas Mensink, PhD thesis, Université de Grenoble (INRIA-Grenoble and Xerox Research Centre Europe) | AFRIF Thesis award | 2012. [pdf]
- Multi-Observations Newscast EM for Distributed Multi-Camera Tracking, Thomas Mensink, Master’s thesis | Universiteit van Amsterdam | 2007. [pdf]
Patents
- Semantic multisensory embeddings for video search by text, Amir Habibian, Thomas Mensink and Cees Snoek, Qualcomm Inc., US patent application, filing date Sept-2015 | 2015.
- Metric learning for nearest class mean classifiers, Thomas Mensink, Jakob Verbeek, Florent Perronnin and Gabriela Csurka, XEROX Corp., US patent, number US20140029839 | 2012.
- Learning structured prediction models for interactive image labeling, Thomas Mensink, Jakob Verbeek and Gabriela Csurka, XEROX Corp., US patent, number US20120269436 | 2011.
- Large scale image classification, Florent Perronnin, Jorge Sánchez and Thomas Mensink, XEROX Corp., US Patent, number US20120045134 | 2010.
- Retrieval systems and method employing probabilistic cross-media relevance feedback, Thomas Mensink, Jakob Verbeek and Gabriela Csurka, XEROX Corp., US Patent, number US20120054130 | 2010.