How to Find the Best Replacement for VVD Using Data Science

A data-driven approach to identifying the best replacement for the injured Liverpool center back Virgil van Dijk

TL;DR

In this article, we scout for the best alternative to the injured Reds defender Virgil van Dijk.
The steps involved in the process are as follows:
* Data Collection: All relevant player stats are collected from fbref.com.
* KPI Generation: Key performance indicators are created for evaluating players.
* Data Wrangling: Data cleansing and feature engineering are performed.
* Finding the Players Most Similar to VVD: Players with KPIs similar to VVD’s are found using proximity measures.
* Conclusion: We comment on the findings of the exercise.

Introduction

Virgil van Dijk is arguably the most important player for the Reds. Since his arrival at Anfield in January 2018 for a record transfer fee, he has been an integral part of the team, leading from the back.

Eric The Fish from UK, CC BY 2.0 <https://creativecommons.org/licenses/by/2.0>, via Wikimedia Commons

Before his arrival, the Reds had been struggling with a weak, leaky defense. They had also gone without a league title or a Champions League trophy since the Miracle of Istanbul in 2005.

During his two full seasons with the Reds, he has led the team to the top of the world, winning the UEFA Champions League and the Premier League. He reached the Ballon d’Or podium in 2019, the only defender to do so since Fabio Cannavaro won the award in 2006. He was also the man of the match in the 2019 Champions League final. He is an integral part, and the captain, of the Dutch national team.

VVD has picked up a nasty injury after a reckless challenge by the Toffees’ goalkeeper Pickford. He is expected to miss the rest of the current season. This has put the Reds in a precarious situation: they have to reinforce their defense by signing a world-class defender to fill the void left by VVD. In this article, we analyze player statistics to identify the best replacement for the Dutchman.

Data Collection

For performing any analysis, the first step is gathering data. Players’ statistics are sourced from fbref.com.
The replacement for VVD will have to act as the backbone of the Reds’ defense. He will have to face tough opposition in the Premier League and the Champions League from day one.
In view of this, we consider only players with top-flight experience, that is, players from the top five leagues: the English Premier League, La Liga, the Bundesliga, Serie A and Ligue 1.
Players from smaller leagues may be considered when scouting for a long-term replacement, when there is enough time for player development and the stakes are lower. We expect VVD to return to the squad fully fit by next season, so a long-term replacement is not necessary.
We use stats from last season, that is, the 2019/20 season.

Generation of Key Performance Indicators (Feature Engineering)

Once the data collection is complete, the next step is identifying the Key Performance Indicators (KPIs) for evaluating players.
Identifying the KPIs is the most important step in the process: if different KPIs are considered, we will get different results.

“Everybody is a genius. But if you judge a fish by its ability to climb a tree, it will live its whole life believing that it is stupid.”
Commonly attributed to Albert Einstein

Key performance indicators measure the most critical function of the player in the club’s overall strategy.
For a defender, quality of finishing is generally not important, whereas defensive actions, passing and the like are of higher priority.
The KPIs depend on the club’s style of play and what is expected of the player.
KPIs are expressed as percentages or as per-90-minute stats so that players with different amounts of playing time can be compared.
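As a rough illustration of the two KPI forms, the sketch below computes a per-90 rate and a success percentage in plain Python. The function names, the choice to return `None` when a percentage is undefined, and the numbers are all assumptions made for illustration; they are not the author's code or real fbref data.

```python
from typing import Optional


def per_90(total: float, minutes_played: float) -> float:
    """Rate of an event per 90 minutes of playing time."""
    if minutes_played <= 0:
        raise ValueError("minutes_played must be positive")
    return total * 90 / minutes_played


def percentage(successes: float, attempts: float) -> Optional[float]:
    """Success rate as a percentage, or None when there were no attempts."""
    if attempts == 0:
        return None  # assumption: treat an undefined rate as missing
    return successes * 100 / attempts


# Illustrative numbers only, not real player data.
tackles_attempted, tackles_won, minutes = 40, 26, 1800

tackles_per_90 = per_90(tackles_attempted, minutes)            # 40 tackles over 20 full matches
tackles_won_pct = percentage(tackles_won, tackles_attempted)   # share of tackles won
```

Dividing by minutes rather than by matches keeps a player who is often substituted comparable with one who plays every minute.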

The KPIs are built around the areas of the game where VVD has the greatest impact.
In our view, his most important duties in the Liverpool squad fall into the following areas:

Defensive Action

VVD, at the heart of Liverpool’s defense, does an incredible job of keeping opposition attacks at bay with his tackles, interceptions, blocks and clearances.
These actions can be quantified by the following KPIs:
* Tackles attempted per 90 minutes
* Tackles won percentage
* Percentage of dribblers dribbled past
* Shots blocked per 90 minutes
* Interceptions per 90 minutes
* Clearances per 90 minutes

Passing

VVD ranks among the top players in the Premier League for total number of passes. His intelligent passes are an integral part of the Reds’ build-up play.
He has a great passing range and is not afraid to go long.
Many of Liverpool’s goals last season were initiated by VVD’s signature long passes, played over the top of the opposition’s defensive line to find Salah or Mané.
The following KPIs are used to measure passing-related actions:
* Total passes attempted per 90 minutes
* Pass Completion percentage
* Distance covered per 90 minutes
* Long passes attempted per 90 minutes
* Long pass accuracy

Miscellaneous Stats (fbref terminology)

Standing 6 ft. 4 in. tall, VVD is also a formidable aerial presence. He frequently engages in aerial duels and had an impressive win percentage of more than 80% last season.
He is also a potent goal threat from Liverpool’s corners and a one-man wall against opposition set pieces.

The following KPIs measure the frequency and quality of aerial duels:
* Aerial duels per 90
* Aerial duels win percentage

Data Cleansing and Wrangling

Removing Null Values

The data set consists of stats for 2,732 players, including players in roles other than defense.
The stats for some players contain null values, and these players need to be omitted.
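A minimal sketch of this filtering step is shown below, assuming each player is held as a dictionary of KPI values. The column names, player labels and numbers are placeholders invented for illustration, and the sketch treats both `None` and NaN as missing.

```python
import math


def is_missing(value) -> bool:
    """True for None or a float NaN, the two null forms handled here."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def drop_incomplete(rows, columns):
    """Keep only the rows in which every listed KPI has a usable value."""
    return [
        row for row in rows
        if not any(is_missing(row.get(col)) for col in columns)
    ]


# Hypothetical rows; names and numbers are placeholders, not fbref data.
players = [
    {"player": "Player A", "tackles_per_90": 1.2, "aerial_win_pct": 71.0},
    {"player": "Player B", "tackles_per_90": None, "aerial_win_pct": 55.0},
    {"player": "Player C", "tackles_per_90": 2.4, "aerial_win_pct": 63.5},
    {"player": "Player D", "tackles_per_90": 0.9, "aerial_win_pct": float("nan")},
]
kpi_columns = ["tackles_per_90", "aerial_win_pct"]

complete_players = drop_incomplete(players, kpi_columns)
```

Dropping whole rows, rather than filling gaps with a default, avoids comparing VVD against values that were never observed.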

Scaling Data

All the parameters are scaled to minimize bias.
For example, the KPIs passes attempted per 90 minutes, aerial duels per 90 minutes and distance covered per 90 minutes are on different scales.
To avoid bias arising from these differences in scale, we transform the data using a MinMax scaler, which converts each stat into a value between 0 and 1.
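The scaler named in the text is presumably scikit-learn's `MinMaxScaler`; that is an assumption, since the article does not show its code. The plain-Python sketch below applies the same min-max formula, x' = (x - min) / (max - min), to one KPI column. Mapping a constant column to 0.0 is a choice made here for illustration, and the input values are placeholders.

```python
def min_max_scale(values):
    """Rescale numbers linearly so the smallest maps to 0.0 and the largest to 1.0."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # Assumption: a column with no spread carries no information, so map it to 0.0.
        return [0.0 for _ in values]
    return [(value - lowest) / (highest - lowest) for value in values]


# Placeholder values for one KPI column (passes attempted per 90 minutes).
passes_per_90 = [30.0, 60.0, 90.0]
scaled_passes = min_max_scale(passes_per_90)
```

Each KPI column is scaled separately, so a KPI measured in the tens (passes) and one measured in single digits (aerial duels) end up contributing on the same 0 to 1 range.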

Finding Players with Similar KPIs as VVD

Similarity between two players can be measured by applying a proximity measure to their KPIs. Available proximity measures include Euclidean distance, Manhattan distance and cosine distance.
We use Euclidean distance to find the players most similar to VVD. The smaller the Euclidean distance between two players, the more similar they are.
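The ranking step can be sketched as follows, assuming each player is represented by a list of scaled KPI values. The helper names, the three-element vectors (the article uses 13 KPIs) and the player labels are placeholders for illustration only.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length KPI vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def most_similar(target, candidates, top_n=10):
    """Return (name, distance) pairs, closest first, limited to top_n entries."""
    ranked = sorted(
        ((name, euclidean_distance(target, vector)) for name, vector in candidates.items()),
        key=lambda pair: pair[1],
    )
    return ranked[:top_n]


# Placeholder scaled KPI vectors, not real player data.
target_vector = [0.8, 0.6, 0.9]
candidates = {
    "Player A": [0.8, 0.6, 0.5],
    "Player B": [0.2, 0.1, 0.3],
    "Player C": [0.7, 0.6, 0.9],
}

ranking = most_similar(target_vector, candidates, top_n=2)
```

Here Player C comes first because it differs from the target only slightly on one KPI, while Player A matches on two KPIs but is far off on the third.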

To learn about different distance measures and their use cases, see Maarten Grootendorst’s article on the topic.

The 10 players most similar to VVD, and their scaled scores under each of the 13 KPIs, are shown below:

Image by Author

The best replacement for VVD, considering the above-mentioned KPIs, is Chelsea’s center back Kurt Zouma.
Experienced defenders like Hummels, Boateng, Javi Martinez, Marcelo, Pique and Feddal also seem to be good fits. The vast experience of these players could add strength to the currently shaky defensive unit and help nurture the youth players. Most of them are nearing the end of their careers and already have replacements available at their clubs. This means that Liverpool could land them at a bargain price in the January transfer window. They could act as placeholders for VVD until he makes it back to the squad. However, the physical fitness of these players is a cause for concern, considering the jam-packed schedule of the current season.

On the other hand, splashing some cash right now to acquire a younger player is also a sensible option, considering the following factors:
* The current fixture list is jam-packed and demands a lot physically from the players.
* Of the other center backs, Gomez is also currently injured and Matip is injury-prone.
* Liverpool’s youth team players lack top-flight experience and the time to settle into the first team.
* A new signing would reduce the squad’s over-dependence on VVD by providing a rotation option.
* James ‘Swiss Army Knife’ Milner can step in to fill the gap, but even he cannot deliver peak performance for 90 minutes consistently.
* Catching them young is good in the long run, as Liverpool can book profits later in the transfer market.

Players under the age of 25 who are the best fits are:

Image by Author

Leipzig defender Dayot Upamecano, who is rumoured to be the Reds’ top transfer target, is the best of the lot in tackling.
Dan-Axel Zagadou, a youngster from Klopp’s former team Dortmund, is also a good replacement option. He is the best option as a traditional defender, as he is the best in interceptions and blocks.
Bayern’s center back Süle is head and shoulders above the rest in the passing department. Like VVD, he has a great passing range and can find attacking players with very good accuracy. He is also strong in the air, with a high aerial duel win percentage. However, he doesn’t engage in aerial duels as often, perhaps because of the difference in the Bundesliga’s style of play.
Joachim Andersen, Jonathan Tah, Clément Lenglet, Fikayo Tomori and Brendan Chardonnet are a few other players Liverpool should consider.

Fars News Agency, CC BY 4.0 <https://creativecommons.org/licenses/by/4.0>, via Wikimedia Commons

A word of Caution:

The KPIs used in this analysis reflect our own perception of VVD’s most important duties in the squad.
Others might have a different opinion about his role and may use different KPIs, which can provide a different set of possible replacement options.
All the KPIs are given equal weight in the analysis.
If KPIs are assigned different weights according to their importance, we can find players who are good at areas with more importance.
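One way such weighting could work is a weighted Euclidean distance, in which each KPI's squared gap is multiplied by its weight before summing. This is a sketch of the idea rather than anything the analysis above actually did; the weights, vectors and labels are invented for illustration.

```python
import math


def weighted_euclidean(a, b, weights):
    """Euclidean distance with each KPI's squared gap multiplied by its weight."""
    if not (len(a) == len(b) == len(weights)):
        raise ValueError("vectors and weights must have the same length")
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


# Placeholder scaled KPI vectors: index 0 is a passing KPI, index 1 a tackling KPI.
target = [0.9, 0.5]
passer = [0.9, 0.1]    # matches the target on passing, not on tackling
tackler = [0.5, 0.5]   # matches the target on tackling, not on passing

equal_weights = [1.0, 1.0]
passing_heavy = [3.0, 1.0]  # hypothetical weights that favor the passing KPI

tie_gap = abs(
    weighted_euclidean(target, passer, equal_weights)
    - weighted_euclidean(target, tackler, equal_weights)
)
passer_distance = weighted_euclidean(target, passer, passing_heavy)
tackler_distance = weighted_euclidean(target, tackler, passing_heavy)
```

With equal weights the two candidates are tied. Once passing is weighted more heavily, the candidate who matches the target on passing becomes the closer of the two.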

Conclusion

It is rare to find a player who has had as much impact on a club, both on and off the field, as VVD has had at Liverpool.
Finding a suitable replacement for VVD is a daunting task for Klopp and his scouting team.
In this article, we have suggested a few players with a playing style similar to VVD’s who can be considered as possible replacements.

We would like to express our sincere gratitude to the football and analytics community for their valuable inputs.
Thank you, TomDecroos, Andre Brener, Eric from Between The Posts, Mark Thompson and Ashwin Raman.

Carpe Diem!