
Curriculum Vitae


Work

2024 – now
Research Scientist
Kyutai Labs · Paris, France 🇫🇷

Open Science AI lab — working on multimodal foundation models (vision, speech and text).

2021 – 2024
Deep Learning Research Engineer
Qualcomm AI Research · Amsterdam, Netherlands 🇳🇱

Neural network efficiency — conditional compute, dynamic sparsity, mixture-of-experts.

Education

2015 – 2020
PhD — Machine Learning & Computer Vision
IST Austria · Klosterneuburg, Austria 🇦🇹

Supervised by Prof. Christoph Lampert — Domain adaptation, object detection, transfer learning.

2012 – 2015
Masters + Bachelors in Computer Science
ENS Rennes · Rennes, France 🇫🇷

Specialization in Machine Learning and Computer Science.

2010 – 2012
CPGE (MPSI-MP*)
Lycée Clemenceau · Reims, France 🇫🇷

Research Internships

2020
Google Brain · Zürich, Switzerland 🇨🇭

Knowledge distillation for large neural networks. arXiv:2106.05237

2017
Google Brain · London, UK 🇬🇧

Unsupervised image-to-image translation, GANs and domain adaptation. arXiv:1711.05139

2015
Inria Rennes · Rennes, France 🇫🇷

Unsupervised clustering for text and audio.

2014
IST Austria · Klosterneuburg, Austria 🇦🇹

Adapting pre-trained classifiers to unknown test distributions.

2013
Inria Rennes · Rennes, France 🇫🇷

Video retrieval using circular Fourier transforms.

Selected Publications

Vision-Speech Models: Teaching Speech Models to Converse About Images

Amélie Royer*, Moritz Böhle*, Gabriel de Marmiesse, Laurent Mazaré, Neil Zeghidour, Alexandre Défossez, Patrick Pérez

Conference on Computer Vision and Pattern Recognition (CVPR) 2026

CASA: Cross-Attention over Self-Attention for Efficient Vision-Language Fusion

Moritz Böhle*, Amélie Royer*, Juliette Marrie*, Edouard Grave, Patrick Pérez

arXiv preprint 2025

MSViT: Dynamic Mixed-Scale Tokenization for Vision Transformers

Jakob Drachmann Havtorn*, Amélie Royer*, Tijmen Blankevoort, Babak Ehteshami Bejnordi

ICCV Workshop on New Ideas in Vision Transformers (NViT) 2023

Scalarization for Multi-Task and Multi-Domain Learning at Scale

Amélie Royer, Tijmen Blankevoort, Babak Ehteshami Bejnordi

Conference on Neural Information Processing Systems (NeurIPS) 2023

Skills

Programming
Python, C++, OCaml, C
Deep Learning
PyTorch, JAX, TensorFlow, Keras
Tools
git, SLURM, matplotlib, streamlit, LaTeX
Languages
French 🇫🇷 (native), English 🇬🇧 (fluent), German 🇩🇪 (advanced)