2020-2021 | Sports Analytics & Machine Learning

Olympique de Marseille Recruitment System

An ML-powered player recruitment system built on StatsBomb data and VAEP modeling. It transforms global match data into actionable transfer insights for OM's sporting director.

Python, Machine Learning, VAEP, Sports Analytics, StatsBomb, Football

Overview

This project brought advanced data science and machine learning to professional football recruitment for Olympique de Marseille. By ingesting and analyzing global match data from StatsBomb, I built a comprehensive system that transformed raw event data into actionable player insights.

The system provided OM's sporting director and recruitment team with data-driven transfer lists, moving beyond traditional scouting methods to quantify player value through probabilistic modeling.

The Challenge

Traditional football recruitment relies heavily on subjective scouting reports and basic statistics that fail to capture the true value of player actions. Clubs need a way to:

  • Quantify the impact of every on-field action beyond goals and assists
  • Compare players across different leagues and positions objectively
  • Identify undervalued talent in markets beyond major leagues
  • Make data-driven transfer decisions with statistical confidence

Technical Approach

Data Pipeline

Ingested match event data from StatsBomb, covering thousands of matches worldwide. Each match was decomposed into granular events: passes, shots, tackles, positioning, and more.
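As a rough illustration of this decomposition step, the sketch below flattens nested event records into flat rows. The sample records are made up and only shaped like StatsBomb's public event JSON; `flatten_event` and `flatten_match` are hypothetical helper names, not part of the original pipeline.

```python
import json

# Hypothetical sample shaped like StatsBomb event JSON (all values are made up).
RAW_EVENTS = json.loads("""
[
  {"index": 2, "minute": 0, "second": 7, "type": {"name": "Shot"},
   "team": {"name": "Team A"}, "player": {"name": "Player 2"},
   "location": [105.0, 38.0]},
  {"index": 1, "minute": 0, "second": 4, "type": {"name": "Pass"},
   "team": {"name": "Team A"}, "player": {"name": "Player 1"},
   "location": [60.0, 40.0]},
  {"index": 0, "minute": 0, "second": 0, "type": {"name": "Starting XI"},
   "team": {"name": "Team A"}}
]
""")


def flatten_event(event):
    """Turn one nested event dict into a flat row.

    Returns None for events that have no player or no pitch location
    (for example line-up announcements), since they are not player actions.
    """
    if "player" not in event or "location" not in event:
        return None
    x, y = event["location"][:2]
    return {
        "index": event["index"],
        "time_s": event["minute"] * 60 + event["second"],
        "type": event["type"]["name"],
        "team": event["team"]["name"],
        "player": event["player"]["name"],
        "x": x,
        "y": y,
    }


def flatten_match(events):
    """Flatten all player actions of a match and sort them chronologically."""
    rows = (flatten_event(e) for e in events)
    return sorted((r for r in rows if r is not None), key=lambda r: r["index"])


rows = flatten_match(RAW_EVENTS)
for row in rows:
    print(row)
```

A flat, chronologically ordered table like this is the usual starting point for action-valuation models, because each row can then be paired with the actions that precede and follow it.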

VAEP Modeling

Implemented Valuing Actions by Estimating Probabilities (VAEP), a machine learning framework that assigns a value to every on-the-ball action based on its impact on scoring probability. Each action is evaluated by how it changes the team's likelihood of scoring or conceding over the next few actions.
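The core arithmetic can be sketched in a few lines: an action's value is the gain in the team's scoring probability minus the gain in its conceding probability. The function name and the example probabilities below are illustrative assumptions; in practice the four probabilities would come from trained classifiers.

```python
def vaep_value(p_score_before, p_score_after, p_concede_before, p_concede_after):
    """Value of one action as the change in scoring and conceding probabilities.

    All four inputs are probabilities for the acting player's team,
    estimated for the game state before and after the action.
    """
    offensive = p_score_after - p_score_before            # raising scoring chance is good
    defensive = -(p_concede_after - p_concede_before)     # raising conceding chance is bad
    return offensive + defensive


# Made-up example: a through ball lifts scoring probability from 2% to 8%
# without changing the risk of conceding.
through_ball = vaep_value(0.02, 0.08, 0.01, 0.01)

# Made-up example: a misplaced back pass raises the risk of conceding from 1% to 5%.
bad_back_pass = vaep_value(0.02, 0.01, 0.01, 0.05)

print(round(through_ball, 3), round(bad_back_pass, 3))
```

The sign convention means risky actions that hand the opponent a chance receive negative values, so careless possession is penalised rather than merely ignored.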

Player Evaluation

Aggregated VAEP scores across all actions for each player, creating comprehensive performance profiles. This enabled direct comparison across positions, leagues, and playing styles.
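One plausible way to do this aggregation is to sum action values per player and normalise per 90 minutes, so that players with different amounts of playing time stay comparable. The per-90 normalisation, the `min_minutes` threshold, and the function name are assumptions for illustration, not details confirmed by the project.

```python
from collections import defaultdict


def rate_players(action_values, minutes_played, min_minutes=900):
    """Aggregate per-action values into a per-90-minutes rating per player.

    action_values: iterable of (player, value) pairs, one per action.
    minutes_played: dict mapping player -> total minutes on the pitch.
    min_minutes: hypothetical threshold that drops small, noisy samples.
    """
    totals = defaultdict(float)
    for player, value in action_values:
        totals[player] += value

    ratings = {}
    for player, total in totals.items():
        minutes = minutes_played.get(player, 0)
        if minutes >= min_minutes:
            ratings[player] = total * 90 / minutes
    return ratings


# Made-up numbers: player B is excluded for having too few minutes.
actions = [("A", 0.5), ("A", 0.5), ("B", 2.0)]
minutes = {"A": 900, "B": 450}
ratings = rate_players(actions, minutes)
print(ratings)
```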

Custom Transfer Lists

Generated targeted player recommendations based on OM's specific needs: position requirements, budget constraints, age profiles, and tactical fit. The system surfaced undervalued players whose contributions traditional metrics would miss.
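A minimal sketch of such a shortlist, assuming each candidate already has a per-90 VAEP rating and an estimated fee. The `Candidate` fields, the `build_shortlist` function, and all sample data are hypothetical; tactical fit is not modelled here because it does not reduce to a simple filter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    position: str
    age: int
    fee_eur_m: float     # hypothetical estimated transfer fee, in millions of euros
    vaep_per_90: float   # hypothetical per-90 VAEP rating


def build_shortlist(candidates, position, max_age, max_fee_eur_m, top_n=10):
    """Filter candidates by position, age, and budget, then rank by VAEP rating."""
    eligible = [
        c for c in candidates
        if c.position == position
        and c.age <= max_age
        and c.fee_eur_m <= max_fee_eur_m
    ]
    return sorted(eligible, key=lambda c: c.vaep_per_90, reverse=True)[:top_n]


# Made-up candidate pool.
pool = [
    Candidate("Player 1", "LB", 22, 6.0, 0.31),
    Candidate("Player 2", "LB", 24, 4.5, 0.42),
    Candidate("Player 3", "LB", 29, 3.0, 0.55),   # too old for this brief
    Candidate("Player 4", "LB", 23, 25.0, 0.60),  # over budget
    Candidate("Player 5", "CM", 21, 5.0, 0.70),   # wrong position
]

shortlist = build_shortlist(pool, position="LB", max_age=25, max_fee_eur_m=10.0)
for candidate in shortlist:
    print(candidate.name, candidate.vaep_per_90)
```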

Understanding VAEP

VAEP (Valuing Actions by Estimating Probabilities) changes the unit of player evaluation. Instead of counting events, it measures their impact:

  • Probabilistic scoring: Each action is evaluated by how it changes the probability of the team scoring or conceding in the near future
  • Context-aware: A pass in midfield is valued differently than the same pass in the attacking third
  • Comprehensive coverage: Every on-the-ball action, from passes and dribbles to tackles and interceptions, contributes to a player's overall value
  • ML-powered: Machine learning models trained on thousands of matches learn what actions actually lead to goals

This approach revealed players who excel at "winning actions" - the subtle plays that increase scoring probability but don't show up in traditional statistics.
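To make the "ML-powered" point concrete, the sketch below builds the kind of training targets such models learn from: for each action, did the acting team score, or concede, within a short window of subsequent actions? The window size, the choice to include the action itself, and the simplified `is_goal` flag (which ignores own goals) are assumptions for illustration.

```python
def label_actions(actions, lookahead=10):
    """Build (scores, concedes) training labels for each action of one match.

    actions: chronologically ordered dicts with keys 'team' and 'is_goal'.
    lookahead: number of actions in the window, including the action itself.
    """
    labels = []
    for i, action in enumerate(actions):
        window = actions[i:i + lookahead]
        scores = any(a["is_goal"] and a["team"] == action["team"] for a in window)
        concedes = any(a["is_goal"] and a["team"] != action["team"] for a in window)
        labels.append((scores, concedes))
    return labels


# Made-up sequence: team A scores with the third action.
sequence = [
    {"team": "A", "is_goal": False},
    {"team": "B", "is_goal": False},
    {"team": "A", "is_goal": True},
    {"team": "A", "is_goal": False},
]

labels = label_actions(sequence, lookahead=2)
print(labels)
```

Two binary classifiers trained on labels like these, one for scoring and one for conceding, supply the probabilities that are then differenced to value each action.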

Results & Impact

The system provided Olympique de Marseille's recruitment team with:

  • Custom transfer shortlists ranked by VAEP value and filtered by budget, age, and position needs
  • Quantitative player comparisons that revealed undervalued talent in less-scouted leagues
  • Data-driven insights to support or challenge traditional scouting assessments
  • Evidence-based negotiation positions for transfer discussions

This project demonstrated how machine learning can augment traditional football scouting, providing recruitment teams with objective, data-driven insights to complement human expertise.

Technology Stack

Python
StatsBomb API
Pandas
Scikit-learn
XGBoost
VAEP Framework
Jupyter Notebooks
Matplotlib
NumPy

Deep Dive: Football Analytics Series

Learn More

Read the full three-part series on football analytics and VAEP modeling on Medium.