Ferran Espuña
PhD Student in Mathematics at UAM–ICMAT
Arithmetic Combinatorics · Higher-Order Fourier Analysis · Former AI Research Engineer
Madrid, Spain · ferranespuna@gmail.com · +34 600 24 69 87 · GitHub
About
I am a first-year PhD student in mathematics at Universidad Autónoma de Madrid and ICMAT, advised by Pablo Candela. My research lies in arithmetic combinatorics, with a particular interest in higher-order Fourier analysis, additive structures, inverse theorems, and connections with harmonic analysis and ergodic theory.
Before starting my PhD, I worked as a research engineer at the Barcelona Supercomputing Center, where I contributed to large language models, multilingual pretraining corpora, and high-performance AI systems.
Research Interests
- Arithmetic combinatorics
- Higher-order Fourier analysis
- Additive combinatorics
- Ergodic-theoretic methods
- Harmonic analysis
- Extremal and probabilistic combinatorics
Education
PhD in Mathematics
Universidad Autónoma de Madrid (UAM) / ICMAT | 2026–Present
Advisor: Pablo Candela
Research focused on arithmetic combinatorics and higher-order Fourier methods, including Gowers norms, inverse theorems, nilspaces, and additive patterns.
Master’s Degree in Advanced Mathematics and Mathematical Engineering
Universitat Politècnica de Catalunya (UPC) | 2025
GPA: 9.45/10
Relevant coursework in algebra, number theory, combinatorics, graph theory, computational complexity, and cryptography.
Master’s Thesis: Finding Partite Hypergraphs Efficiently
Developed a deterministic polynomial-time algorithm for finding large complete balanced k-partite subgraphs in dense k-uniform hypergraphs, matching asymptotic extremal bounds.
Double Bachelor’s Degree in Mathematics and Computer Science
Universitat de Barcelona (UB) | 2023
GPA: 9.0/10
Extraordinary Bachelor’s Degree Award
Publications
Espuña, F. Finding Partite Hypergraphs Efficiently.
Information Processing Letters, 2026.
DOI · arXiv
Palomar, J. et al. A CURATEd CATalog: Rethinking the Extraction of Pretraining Corpora for Mid-Resourced Languages.
Proceedings of LREC-COLING 2024.
Paper
Research & Professional Experience
Barcelona Supercomputing Center — Research Engineer
2023–2026
Research engineer in the Language Technologies unit working on large-scale language models and AI infrastructure.
- Developed and automated CURATE, a large-scale text processing pipeline for HPC environments
- Contributed to CATalog, the largest Catalan pretraining corpus
- Contributed to the Salamandra multilingual LLM family trained on MareNostrum 5
- Researched evaluation methodologies for open-ended language generation
- Investigated state space models, sparse autoencoders, and multimodal architectures
Computer Vision Center — Research Intern
2022
Applied topological data analysis techniques to study neural network generalization.
ChipScope Research Group (UB) — Image Processing
2022
Contributed to a European chip-scale microscopy project involving computational imaging pipelines, wave backpropagation, and system interface design.
Technical Background
Mathematics
Arithmetic combinatorics, graph theory, discrete mathematics, additive methods, probabilistic combinatorics
Machine Learning & HPC
Large language models, multimodal systems, model evaluation, PyTorch, distributed training, Slurm, Docker, Linux
Programming
Python, C/C++, Java, Bash
Languages
Spanish (native), Catalan (native), English (C2)
Selected Project
Complex Fractal Shaders
Interactive GLSL fragment shader visualizing fractals that emerge from complex dynamical systems.
Additional Information
- Stanford University — Machine Learning Specialization
- IELTS Academic — 8.5/9