After an exciting Euro 2024, it’s time to assess how Quintessa’s N-Estimates sports rating algorithm has performed.

Last month, Quintessa’s "N-Estimates" algorithm was busy predicting the results of matches at UEFA Euro 2024. Now that the dust has settled, it’s time to analyse how the algorithm performed over the four weeks of competition. Previously, each prediction was accompanied by a plot showing the probability of every possible scoreline, with the most likely scoreline marked by a cross. These plots have now been updated to highlight the actual result in green. The final plots for all the matches can be found at the end of this news story.

When assessing the performance of the algorithm during the competition, we have considered three performance metrics:

- the percentage of correct outcome (win/draw/loss) predictions;
- the percentage of correct goal difference predictions; and
- the percentage of correct exact scoreline predictions.
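
As an illustration, the three metrics can be computed from a list of predicted and actual scorelines. The following is a minimal sketch only: the function names and the (home goals, away goals) representation are assumptions made for this example, not part of N-Estimates itself.

```python
from typing import Dict, List, Tuple

Score = Tuple[int, int]  # (home goals, away goals); an assumed representation


def outcome(score: Score) -> str:
    """Classify a scoreline as a win, draw or loss for the first-named team."""
    home, away = score
    if home > away:
        return "win"
    if home < away:
        return "loss"
    return "draw"


def goal_difference(score: Score) -> int:
    """Signed goal difference from the first-named team's point of view."""
    return score[0] - score[1]


def hit_rates(predicted: List[Score], actual: List[Score]) -> Dict[str, float]:
    """Fraction of matches with the correct outcome, goal difference and scoreline."""
    if not predicted or len(predicted) != len(actual):
        raise ValueError("need two non-empty lists of equal length")
    pairs = list(zip(predicted, actual))
    n = len(pairs)
    return {
        "outcome": sum(outcome(p) == outcome(a) for p, a in pairs) / n,
        "goal_difference": sum(
            goal_difference(p) == goal_difference(a) for p, a in pairs
        ) / n,
        "scoreline": sum(p == a for p, a in pairs) / n,
    }
```

Note that a correct scoreline implies a correct goal difference, which in turn implies a correct outcome, so the three rates can only decrease in that order.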

Figure 1 compares the algorithm’s performance over the whole of Euro 2024 against a benchmark for each of these metrics. Throughout this analysis, the predictions are compared with the scores at full time, including extra time if applicable; matches that were decided by a penalty shoot-out are counted as draws. The benchmarks are calculated by random selection from the distribution of results that actually occurred at Euro 2024; this approach ensures the highest benchmarks that could be obtained for this tournament from random guessing.
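
One way to read this benchmark, sketched below, is as the expected hit rate of a guesser who picks each result at random with the frequency at which it occurred in the tournament. Under that assumption the expected hit rate is the sum of the squared frequencies. The function name is hypothetical and the calculation actually used for Figure 1 may differ.

```python
from collections import Counter
from typing import Hashable, Iterable


def random_guess_benchmark(results: Iterable[Hashable]) -> float:
    """Expected hit rate when guessing at random from the observed frequencies.

    If a result occurs with frequency p, a guesser who also picks it with
    probability p is right about it with probability p * p. Summing over all
    distinct results gives the overall expected hit rate.
    """
    counts = Counter(results)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one observed result")
    return sum((count / total) ** 2 for count in counts.values())
```

The same function can be applied to a list of outcomes, goal differences or exact scorelines, giving one benchmark per metric. For example, outcomes that split 50% wins, 25% draws and 25% losses would give a benchmark of 0.375.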

The algorithm performed considerably better than chance. The correct outcome was predicted in 51% of matches, approximately 18 percentage points higher than would be expected by chance. The correct goal difference was predicted in 33% of matches (compared with the 21% benchmark) and the exact scoreline was correctly predicted in 16% of matches (compared with the 11% benchmark).

Following the Qatar World Cup in 2022, we compared the algorithm’s predictions with the expert judgement of BBC pundit Chris Sutton. Chris and the algorithm tied for the number of correct outcomes and scorelines predicted, with Chris beating the algorithm by one correct goal difference prediction. At Euro 2024, the algorithm again performed similarly to Chris. This time, Chris just edged the outcome and exact scoreline categories, with 27 correct outcomes (against the algorithm’s 26) and 9 correct scorelines (against the algorithm’s 8). However, N-Estimates won the goal difference category by two matches (17 compared with Chris’ 15). Perhaps we’ll call this one an honourable draw!

There is a large degree of inherent variability in the outcomes of football matches. For each match, our predictions included calculated probabilities for various scorelines. For a given match, we can use these probabilities to calculate the percentile of the predicted distribution in which the actual result falls. Plotting the cumulative distribution of these percentiles across matches allows an assessment of how well (or otherwise) the variability of the predictions matches that of the real-life results. Such a plot is shown in Figure 2.
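
The sketch below shows one possible way to carry out this kind of check. It assumes that the percentile of a result is the total predicted probability of all scorelines judged at least as likely as the one observed, so that the most likely scoreline gets the lowest percentile. That ordering, along with the function names, is an assumption for illustration; the definition used to produce Figure 2 may differ.

```python
from typing import Dict, List, Tuple

Score = Tuple[int, int]  # (home goals, away goals); an assumed representation


def result_percentile(probs: Dict[Score, float], actual: Score) -> float:
    """Total predicted probability of scorelines at least as likely as `actual`.

    A scoreline missing from `probs` is treated as having probability zero,
    so it falls at the very end of the predicted distribution.
    """
    p_actual = probs.get(actual, 0.0)
    return sum(p for p in probs.values() if p >= p_actual)


def empirical_cdf(percentiles: List[float], grid: List[float]) -> List[float]:
    """Fraction of matches whose percentile is at or below each grid value."""
    if not percentiles:
        raise ValueError("need at least one percentile")
    n = len(percentiles)
    return [sum(q <= x for q in percentiles) / n for x in grid]
```

Calling `result_percentile` once per match and passing the results to `empirical_cdf` gives the points of a curve that can be compared with the diagonal reference line.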

If the distribution of predictions exactly matched the distribution of observed match results, the cumulative distribution would follow the black dashed line in the figure. The blue line shows the cumulative distribution for all matches at Euro 2024. It is close to the black line, but lies slightly below it, indicating that the algorithm marginally overestimated the variability in the results. In particular, the algorithm predicted the exact scoreline more often than expected, and there were comparatively few matches in which the observed result had been given a low probability by the algorithm. This is exactly what we would want from the algorithm!

*Quintessa is not affiliated in any way with FIFA, UEFA or the BBC. Its application of the N-Estimates algorithm to UEFA Euro 2024 is an independent and non-commercial endeavour.*