Study : Ligue 1



The world of football is increasingly linked to the betting market, yet no forecasting model based on Dixon-Coles (1997) has been openly applied to the results of the French Ligue 1.

The problem : how can we develop a forecasting tool, based on a recognized scientific model, that takes into account the historical data of Ligue 1 ?



Data Mining :

Using data crawling and data mining techniques, we collected data for all Ligue 1 matches from the 1993-1994 season until today.


Data Industrialisation :

For each game, a lot of data is available: the final score, the times at which the goals were scored, the names of the goal scorers, the number of corners, the number of offsides, etc.


Ranking data :

In this first version of the analysis, we considered the following for each game :

  • season of the match
  • date of the match
  • name of the home team / away team
  • number of goals scored by the home team / away team
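
As a minimal sketch, the fields listed above could be held in a small record type like the one below. The class name `Match`, its field names and the example row are assumptions made for illustration; they are not taken from the project's code or dataset.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Match:
    """One Ligue 1 game, restricted to the fields used in this first version."""
    season: str       # season of the match, e.g. "1993-1994"
    played_on: date   # date of the match
    home_team: str    # name of the home team
    away_team: str    # name of the away team
    home_goals: int   # goals scored by the home team
    away_goals: int   # goals scored by the away team


# Made-up row for illustration only (placeholder teams, date and score).
example = Match("1993-1994", date(1993, 8, 1), "Team A", "Team B", 1, 0)
```

Keeping the record immutable (`frozen=True`) is a design choice for this sketch: historical results should not change once collected.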


The Dixon-Coles model (1997) that we have chosen :

  1. It models separate attacking and defensive strengths for each team.
  2. It allows for a dependence between the goals scored by the home team and those scored by the away team in "low-scoring" games, i.e. games with few goals (0-0, 1-0, 0-1, 1-1).
  3. It takes into account the change in each team's performance over time. The idea is that a team's performance is probably closer to its performance in recent games than to its performance in older games.
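
The three points above can be sketched in a few lines of Python, following the formulation in Dixon and Coles (1997): Poisson goal counts driven by attack and defence parameters, a correction factor for the four low-scoring results, and an exponential down-weighting of older matches. This is only an illustration under stated assumptions, not the project's actual implementation: the function and parameter names are ours, the parameter values would normally be estimated by maximising a time-weighted likelihood, and the unit of time used for the weighting is a modelling choice.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k goals under a Poisson law with the given mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def tau(x: int, y: int, lam: float, mu: float, rho: float) -> float:
    """Low-score correction: differs from 1 only for 0-0, 0-1, 1-0 and 1-1."""
    if x == 0 and y == 0:
        return 1.0 - lam * mu * rho
    if x == 0 and y == 1:
        return 1.0 + lam * rho
    if x == 1 and y == 0:
        return 1.0 + mu * rho
    if x == 1 and y == 1:
        return 1.0 - rho
    return 1.0


def score_probability(x: int, y: int,
                      attack_home: float, defence_home: float,
                      attack_away: float, defence_away: float,
                      home_advantage: float, rho: float) -> float:
    """Probability that the home team scores x goals and the away team y."""
    lam = attack_home * defence_away * home_advantage  # expected home goals
    mu = attack_away * defence_home                    # expected away goals
    return tau(x, y, lam, mu, rho) * poisson_pmf(x, lam) * poisson_pmf(y, mu)


def time_weight(elapsed: float, xi: float) -> float:
    """Weight of a match played `elapsed` time units ago; xi = 0 means no decay."""
    return math.exp(-xi * elapsed)
```

With `rho = 0` the correction disappears and the model reduces to two independent Poisson counts; with `xi = 0` every past match carries the same weight.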


Ligue 1 now has a predictive model based on Dixon-Coles (1997). Anyone can consult real-time forecasts for the next matchday. A system combining a virtual machine and machine learning automatically retrieves the league's new results and refits the model in order to provide statistics for the following matchday.
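
One statistic such a forecast can provide is the probability of a home win, a draw or an away win. The sketch below assumes the model's output for a fixture is available as a grid `score_probs[x][y]`, the probability that the home team scores x goals and the away team y; the function name and this input layout are hypothetical and not part of the published tool.

```python
def outcome_probabilities(score_probs):
    """Collapse a grid of scoreline probabilities into (home win, draw, away win).

    score_probs[x][y] is the probability of x home goals and y away goals.
    The grid is truncated at some maximum score, so the three totals are
    renormalised to sum to 1.
    """
    home = draw = away = 0.0
    for x, row in enumerate(score_probs):
        for y, p in enumerate(row):
            if x > y:
                home += p
            elif x == y:
                draw += p
            else:
                away += p
    total = home + draw + away
    return home / total, draw / total, away / total
```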


A playful data visualization interface, with the clubs' crests and interactive graphics, has been set up to explore the upcoming predictions. In addition, the working methodology is freely available and can be used by any data scientist wishing to improve the model. Let's rock!