gabrielcretin.fr
Gabriel Cretin

I build deep learning models for protein biology and the infrastructure to train them. From designing neural architectures to managing GPU clusters, I handle both the science and the systems.

Deep Learning

Computational Biologist - Ph.D

Specialized in protein language models and distributed GPU training in PyTorch, backed by 40k+ A100/H100 GPU hours on national HPC (IDRIS/CNRS). Published research on protein structure analysis, prediction, and generative modeling.

Input: pLM Embeddings • Training: Model • 3D Structure • Flexibility • Fold Recognition
PyTorch Lightning • ESM-2 / Ankh • Autoencoders • Contrastive Learning

SRE & MLOps

Linux System Administrator

Running 15+ production web services including GitLab, JupyterHub, and structural biology tools. Managing a GPU cluster with 20+ workstations, 380+ CPUs, and 1 PB+ backup infrastructure.

Ansible • Docker Swarm • SLURM / HPC • OpenLDAP

Professional Profile

A rare hybrid profile combining 6 years of Linux infrastructure management with 4 years of deep learning research. I design novel AI architectures and deploy them on infrastructure I build myself, from training on national supercomputers to production APIs serving 18K+ weekly requests.

  • 40k+ GPU hours (IDRIS/CNRS)
  • 8 peer-reviewed papers
  • 700+ HPC cores managed
  • 18K+ weekly API requests

Scientific Track

Deep Learning Research

Focus on Protein Language Models (ESM-2, Ankh, ProtTrans) and generative architectures. Developed Adversarial Autoencoders for embedding compression and contrastive learning for improved fold recognition, published in top-tier journals.

  • 8 peer-reviewed publications
  • 40k+ GPU hours on IDRIS/CNRS
  • 4 production web tools (PYTHIA, SWORD2, ICARUS, PEGASUS)

Engineering Track

SRE & Infrastructure

Managing a complete Linux ecosystem: web servers, GPU clusters, HPC, and centralized authentication. Building reliable platforms for research teams with automated provisioning and monitoring.

  • 14 web servers, 12 databases (~18K views/week)
  • HPC cluster: 48 nodes, 708 cores
  • 1 PB+ cumulative storage infrastructure

AI & Protein Science

Ph.D Research (2021–2025)

Representation Learning & Generative Modeling

Designed Adversarial Autoencoder (AAE) architectures to compress high-dimensional pLM embeddings (ESM-2, Ankh) into fixed-size latent spaces. Implemented contrastive triplet learning to improve structural fold recognition, surpassing state-of-the-art structure-based methods. Explored de novo protein design through latent space interpolation.
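The contrastive triplet objective mentioned above can be illustrated with a minimal sketch. It is written in plain Python rather than PyTorch to stay self-contained, and it assumes Euclidean distance and a margin of 1.0; the function names and toy vectors are hypothetical and are not taken from the thesis code.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss: max(d(a, p) - d(a, n) + margin, 0).

    The loss is zero once the positive (same fold) is closer to the anchor
    than the negative (different fold) by at least `margin`.
    """
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(gap + margin, 0.0)


# Hypothetical 3-dimensional embeddings, for illustration only.
anchor = [0.0, 0.0, 0.0]
positive = [0.0, 3.0, 4.0]   # distance 5 from the anchor
negative = [6.0, 8.0, 0.0]   # distance 10 from the anchor

# Well-separated triplet: the margin is already satisfied, so no penalty.
easy = triplet_margin_loss(anchor, positive, negative)

# Swapping the roles violates the margin and produces a positive loss.
hard = triplet_margin_loss(anchor, negative, positive)

print(easy, hard)
```

In a real training loop the vectors would be batches of learned embeddings and the loss would be averaged over many triplets.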

Papers in preparation - currently writing manuscripts on representation learning, adversarial autoencoders, and contrastive learning.

Adversarial Autoencoders • Triplet Loss • ESM-2 / Ankh / ProtTrans • Fold Recognition

Tech Stack

ML Engineering

  • PyTorch / Lightning: Expert
  • Python / Bash: Expert
  • HPC / SLURM: Advanced
  • AlphaFold / Foldseek: Advanced

SRE & Infrastructure

HPC & Compute

Cluster Management

5 GPU servers for deep learning & molecular dynamics. SGI HPC cluster: 48 nodes, 708 cores. 25 Linux workstations with 700+ CPUs.

SLURM • Spack / Environment Modules

Services

Self-Hosted Stack

14 web servers, 2 APIs, 12 databases serving ~18K views/week. GitLab, Mattermost, JupyterHub, centralized auth.

GitLab CI/CD • OpenLDAP • JupyterHub • FastAPI

Automation & Networking

Infrastructure as Code

1 PB+ backup infrastructure (3 dedicated servers). Ansible playbooks, Docker Swarm orchestration, Samba/NFS storage.

Ansible • Docker Swarm • Apache2 / NFS • EfficientIP

Publications

Featured Publications

PEGASUS

Protein Science 2025

Co-first author

Protein flexibility is essential to its biological function. However, experimental methods for its assessment, such as X-ray crystallography and nucle...

ATLAS

Nucleic Acids Research 2024

Contributing author

Dynamical behaviour is one of the most crucial protein characteristics. Despite the advances in the field of protein structure resolution and predicti...

SWORD2

Nucleic Acids Research 2022

First author

Understanding the functions and origins of proteins requires splitting these macromolecules into fragments that could be independent in terms of folding, activity, or evolution. For that purpose, stru...

MEDUSA

Journal of Molecular Biology 2021

Contributing author

MEDUSA: Prediction of Protein Flexibility from Sequence

Papers in Preparation

Currently writing manuscripts on my Ph.D. research — expected submission in 2025:

  • Representation Learning with Adversarial Autoencoders — Compressing pLM embeddings into continuous latent spaces for protein generation
  • Contrastive Learning for Fold Recognition — Triplet-based training to improve structural similarity detection beyond structure-based methods
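As a rough illustration of why a continuous latent space is useful for generation, the sketch below walks a straight line between two latent vectors. It assumes simple linear interpolation; the two-dimensional vectors and function names are hypothetical, and the decoder that would map each intermediate point back to a protein representation is deliberately left out.

```python
def lerp(z_start, z_end, t):
    """Linear interpolation between two latent vectors for t in [0, 1]."""
    return [(1.0 - t) * a + t * b for a, b in zip(z_start, z_end)]


def interpolation_path(z_start, z_end, steps):
    """Return `steps` evenly spaced points from z_start to z_end, inclusive."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    return [lerp(z_start, z_end, i / (steps - 1)) for i in range(steps)]


# Hypothetical 2-dimensional latent codes of two proteins.
z_a = [0.0, 2.0]
z_b = [4.0, 0.0]

path = interpolation_path(z_a, z_b, steps=5)
for point in path:
    # In a full pipeline, each point would be passed to a trained decoder.
    print(point)
```

The endpoints of the path reproduce the two input codes, and the intermediate points are candidates for decoding into new samples.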

Complete Bibliography

10 peer-reviewed papers • 4 first/co-first author • Updated: 2026-01-28

2025

Ragousandirane Radjasandirane, Gabriel Cretin, Julien Diharce, Alexandre G. de Brevern, Jean-Christophe Gelly. PATHOS: Predicting Variant Pathogenicity by Combining Protein Language Models and Biological Features.

Yann Vander Meersche, Gabriel Duval, Gabriel Cretin, Aria Gheeraert, Jean-Christophe Gelly, Tatiana Galochkina. PEGASUS: Prediction of MD-derived protein flexibility from sequence. Protein Science, 2025.

DOI: 10.1002/pro.70221 • Co-first author

Aria Gheeraert, Thomas Bailly, Yani Ren, Ali Hamraoui, Julie Te, Yann Vander Meersche, Gabriel Cretin, Ravy Leon Foun Lin, Jean-Christophe Gelly, Serge Pérez, Frédéric Guyon, Tatiana Galochkina. DIONYSUS: a database of protein–carbohydrate interfaces. Nucleic Acids Research, 2025.

Charlotte Périn, Gabriel Cretin, Jean-Christophe Gelly. Hierarchical Analysis of Protein Structures: From Secondary Structures to Protein Units and Domains. Methods in Molecular Biology, 2025.

2024

Yann Vander Meersche, Gabriel Cretin, Aria Gheeraert, Jean-Christophe Gelly, Tatiana Galochkina. ATLAS: protein flexibility description from atomistic molecular dynamics simulations. Nucleic Acids Research, 2024.

2023

Gabriel Cretin, Charlotte Périn, Nicolas Zimmermann, Tatiana Galochkina, Jean-Christophe Gelly, Lenore Cowen. ICARUS: flexible protein structural alignment based on Protein Units. Bioinformatics, 2023.

DOI: 10.1093/bioinformatics/btad459 • First author

2022

Gabriel Cretin, Tatiana Galochkina, Yann Vander Meersche, Alexandre G. de Brevern, Guillaume Postic, Jean-Christophe Gelly. SWORD2: hierarchical analysis of protein 3D structures. Nucleic Acids Research, 2022.

DOI: 10.1093/nar/gkac370 • PMC9252838 • First author

Nora El Jahrani, Gabriel Cretin, Alexandre G. de Brevern. CALR-ETdb, the database of calreticulin variants diversity in essential thrombocythemia. Platelets, 2022.

2021

Gabriel Cretin, Tatiana Galochkina, Alexandre G. de Brevern, Jean-Christophe Gelly. PYTHIA: Deep Learning Approach for Local Protein Conformation Prediction. International Journal of Molecular Sciences, 2021.

Yann Vander Meersche, Gabriel Cretin, Alexandre G. de Brevern, Jean-Christophe Gelly, Tatiana Galochkina. MEDUSA: Prediction of Protein Flexibility from Sequence. Journal of Molecular Biology, 2021.

CV / Resume

Download the PDF and browse a short timeline snapshot.

Download CV • Updated: Jan 2026
  1. PhD Thesis Defense

    Université Paris Cité - "Deep learning approaches for protein analysis, prediction, and generation."

  2. PhD Student

    DSIMB Lab - Compression of protein language model embeddings into a continuous latent space for generation,
    plus protein structure analysis and prediction.

  3. Lead Linux System Administrator

    Managing the full-stack infrastructure: 30+ GPU workstations, GPU cluster, 1 PB+ storage and containerized web services.

  4. MSc (Master) - Biology, Computer Science, Bioinformatics

    Université Paris Cité - with honors (rank 2/23).

  5. BSc (Licence 3) - Biology, Computer Science, Bioinformatics

    Université Paris Cité - rank 6/21.

  6. Two-year degree (D.U.T) - Bioengineering & Bioinformatics

    Université de Clermont-Ferrand (Campus Aurillac) - ranks 4/45 (1st year) and 5/34 (2nd year).

Contact

Best way to reach me: gabrielcretin@gmail.com

Paris, France

🇫🇷 French • 🇬🇧 English

© Gabriel Cretin