Vignesh Jayanth

Weighted Average Ranking Model to Rank Players using Wyscout Data

Updated: Feb 25, 2021



Let’s assume we’re trying to replace Ryan Bertrand (Southampton FC), who wishes to move away from the south coast, with another left-footed full-back from Europe. Imagine you have a list of 70-odd left-back profiles from Wyscout, drawn up from the consensus of your scouting team, and you’re tasked with ranking the potential transfer targets empirically. What would you do? My approach is a simple weighted average ranking model that narrows the list down further. It goes without saying that if you’re looking for a like-for-like replacement, you’re better placed choosing from a final group of 4 recommendations than from 70.


Before I get started, I’m making a few assumptions. I use a data-centred approach first and then follow my intuition to categorize variables. My final recommendation on a player is based on a preference for age and on the defensive rating of the left-back being replaced.

Firstly, with more than 40 variables, there is the curse of dimensionality (!). If the model treats all variables equally, correlated variables with similar distributions count the same information more than once and can overestimate a player’s potential, so they need to be removed. Variables that had a correlation above 70% with another variable and a Variance Inflation Factor (VIF) > 10 were eliminated.




Fig 1.1: The process involved two dimensionality-reduction techniques (Variance Inflation Factor > 10 and correlation > 70%). The variables listed on the right were removed.
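As a rough illustration of this filtering step, here is a minimal sketch in plain Python. It is not the code behind Fig 1.1: the sample numbers, the function names and the choice to drop one variable at a time (the one with the highest VIF) are all assumptions made for the example. It relies on the fact that each variable’s VIF is the matching diagonal entry of the inverse correlation matrix.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def invert(m):
    """Gauss-Jordan inverse of a small square matrix, with partial pivoting."""
    n = len(m)
    aug = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def vif(data):
    """VIF per variable: the diagonal of the inverse correlation matrix."""
    names = list(data)
    corr = [[pearson(data[a], data[b]) for b in names] for a in names]
    inv = invert(corr)
    return {name: inv[i][i] for i, name in enumerate(names)}

def reduce_variables(data, corr_limit=0.70, vif_limit=10.0):
    """Drop the highest-VIF variable until none breaches both thresholds.

    Dropping one variable per pass is an assumption of this sketch.
    """
    kept = dict(data)
    while len(kept) > 1:
        scores = vif(kept)
        flagged = [
            a for a in kept
            if scores[a] > vif_limit
            and any(abs(pearson(kept[a], kept[b])) > corr_limit
                    for b in kept if b != a)
        ]
        if not flagged:
            break
        del kept[max(flagged, key=lambda a: scores[a])]
    return list(kept)

# Hypothetical per-90 numbers for six players (made up, not Wyscout data).
sample = {
    "passes": [30, 42, 35, 50, 45, 38],
    "accurate_passes": [24, 34, 28, 41, 36, 30],
    "tackles": [3, 1, 4, 2, 5, 2],
}
kept = reduce_variables(sample)
print(kept)
```

In this made-up sample, passes and accurate passes carry almost the same information, so one of the two is dropped while tackles survives.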


The mean of each variable was used as the weight assigned to that variable. Following the mathematical approach mentioned here, the image below (Fig 1.2) shows the weights and the standard errors used to estimate the final ratings.
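One way to read the weighting step is sketched below, under stated assumptions: weights are taken as each variable’s mean divided by the sum of all means, and the final score is rescaled so the top player gets 100. The player names, variables and values are hypothetical, and the standard-error step from Fig 1.2 is left out because the text does not say how it enters the rating.

```python
def mean_weights(players):
    """Weights proportional to each variable's mean, normalised to sum to 1.

    Normalising by the sum of means is an assumption of this sketch.
    """
    variables = list(next(iter(players.values())))
    means = {
        v: sum(stats[v] for stats in players.values()) / len(players)
        for v in variables
    }
    total = sum(means.values())
    return {v: m / total for v, m in means.items()}

def weighted_ratings(players, weights):
    """Weighted average per player, rescaled so the best player scores 100."""
    raw = {
        name: sum(weights[v] * stats[v] for v in weights)
        for name, stats in players.items()
    }
    top = max(raw.values())
    return {name: round(100 * score / top, 1) for name, score in raw.items()}

# Hypothetical per-90 values for three players (made up for illustration).
players = {
    "Player A": {"interceptions": 6.0, "duels_won": 8.0, "crosses": 2.0},
    "Player B": {"interceptions": 4.0, "duels_won": 10.0, "crosses": 4.0},
    "Player C": {"interceptions": 5.0, "duels_won": 6.0, "crosses": 3.0},
}
weights = mean_weights(players)
ratings = weighted_ratings(players, weights)
ranking = sorted(ratings, key=ratings.get, reverse=True)
print(weights)
print(ranking)
```

With mean-based weights, variables measured on a larger scale pull more of the total weight, so in practice it may be worth putting the variables on a comparable scale first.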



As football itself is subjective in nature, intuition is key to decision-making. Hence, I used my intuition to categorize variables into the following groups:



Final Recommendations

We’ve now got four left-back recommendations, shown in the image below. They are the top four options by individual defensive rating (100 being the highest). Player 30 is at his peak age and has a strong overall rating, but has the lowest defensive rating of the four. Player 46 is younger than Ryan Bertrand and has the potential to improve further until he hits his peak. He also gels well within a team set-up. Player 47 is a veteran (past his peak) but has good shooting and defensive ratings. Player 25 is a young full-back with the highest defensive rating. Out of these four, how do we recommend just one to our chief scout?
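The shortlisting itself is just a sort on the defensive rating. A minimal sketch, with hypothetical candidates and made-up ratings rather than the players in the image:

```python
def shortlist(ratings, n=4):
    """Return the n players with the highest defensive rating, best first."""
    return sorted(ratings, key=lambda p: ratings[p]["defensive"], reverse=True)[:n]

# Hypothetical candidates; ages and ratings are invented for the example.
candidates = {
    "Candidate A": {"age": 27, "defensive": 71.0},
    "Candidate B": {"age": 23, "defensive": 88.5},
    "Candidate C": {"age": 32, "defensive": 84.0},
    "Candidate D": {"age": 21, "defensive": 100.0},
    "Candidate E": {"age": 25, "defensive": 64.5},
    "Candidate F": {"age": 29, "defensive": 58.0},
}
top_four = shortlist(candidates)
print(top_four)
```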


We can look at Ryan Bertrand’s own stats to see which of the four is the most similar replacement.



These are Ryan Bertrand’s stats from premierleague.com (my Ctrl+C, Ctrl+V isn’t that good).

A general overview of Bertrand’s stats shows that he doesn’t massively favour getting shots away and is more defensive-minded. Hence, my final recommendation is Player 46, with preference given to his age (room to improve until his peak and to settle into the back four) and to his defensive and team ratings. The final stats of Player 46 are shown below:



How would you create a ranking system for players? Do reach out if you’d like to give feedback on this approach or share your own. I’m always up for a chat!
