Load some fun and essential libraries.
suppressMessages(library(dplyr))
suppressMessages(library(tidyr))
suppressMessages(library(readxl))
suppressMessages(library(tinytex))
suppressMessages(library(plotly))
suppressMessages(library(lubridate))
suppressMessages(library(htmltools))
suppressMessages(library(DT))
We took revenue data from the fabulous work released each year by Deloitte with their Deloitte Money Football League. We painstakingly pored over each year’s leaderboard and compiled into Excel, which is presented here as CSV. We have no reason to doubt anything produced by Deloitte, however, we did do a couple of crosschecks against some filings with Companies House. We are content to rely on Deloitte’s work and thank them for it.
Although we are dealing with a review of Liverpool and Manchester City, all teams have been recorded in our csv file.
Clicking on the link below will take you to the Deloitte Football Money League home and contains some excellent and informative reads over the years they have compiled this data.
Deloitte Football Money League
Let’s grab the data we prepared. The read.csv() function will create an R data frame for us with the contents of the csv file we specify. A data frame is an R object similar in structure to a traditional table.
R uses the ‘<-’ assignment operator. There are others but we will not complicate the conversation. Best practice typically recommends <- for assignment.
Football.Revenues <- read.csv(file="footiedata/FootballRevenues.csv", header=TRUE, sep=",")
## This creates an object (in this case a data.frame) named 'Football.Revenues' from the csv file.
head(Football.Revenues, n=10)
## Season Rank Identifier Club Matchday Broadcast Commercial Total
## 1 2017/2018 1 RM Real Madrid 127.1 222.6 315.5 665.2
## 2 2017/2018 2 FCB FC Barcelona 128.3 197.5 285.8 611.6
## 3 2017/2018 3 MUFC Manchester United 105.9 204.1 280.0 590.0
## 4 2017/2018 4 BM Bayern Munich 92.0 156.5 308.9 557.4
## 5 2017/2018 5 MCFC Manchester City 56.6 211.5 235.4 503.5
## 6 2017/2018 6 PSG Paris Saint-Germain 89.2 113.2 277.5 479.9
## 7 2017/2018 7 LFC Liverpool FC 81.2 222.6 151.3 455.1
## 8 2017/2018 8 CFC Chelsea 73.9 204.2 169.9 448.0
## 9 2017/2018 9 ARS Arsenal 98.9 183.3 106.9 389.1
## 10 2017/2018 10 THFC Tottenham Hotspur 75.5 200.7 103.2 379.4
## We call the new object 'Football.Revenues to obtain a visualization and display only top 10.
## Calling str() 'structure' of an object displays information about the object and contents.
str(Football.Revenues)
## 'data.frame': 220 obs. of 8 variables:
## $ Season : chr "2017/2018" "2017/2018" "2017/2018" "2017/2018" ...
## $ Rank : int 1 2 3 4 5 6 7 8 9 10 ...
## $ Identifier: chr "RM" "FCB" "MUFC" "BM" ...
## $ Club : chr "Real Madrid" "FC Barcelona" "Manchester United" "Bayern Munich" ...
## $ Matchday : num 127.1 128.3 105.9 92 56.6 ...
## $ Broadcast : num 223 198 204 156 212 ...
## $ Commercial: num 316 286 280 309 235 ...
## $ Total : num 665 612 590 557 504 ...
## class() will show the type of object.
class(Football.Revenues)
## [1] "data.frame"
We will also be using custom palette for plots leveraging club colors. We are generating this global variable “footie_palette” to reference in all our plots.
footie_palette <- c(rgb(200,16,46, maxColorValue = 255), rgb(108,171,221, maxColorValue = 255))
footie_palette <- setNames(footie_palette, c("LFC", "MCFC"))
LabelTags <- c("Liverpool FC", "Manchester City")
LabelTags <- setNames(LabelTags, c("LFC", "MCFC"))
We are going to visualize total revenues for Liverpool and Manchester City to get a feel for context and orders of magnitude. The goal is to filter the data.frame Football.Revenues to work with only LFC and MCFC data. The filter() function is a component of library(dplyr).
LFCvMCFC.Revenues <- # assignment to an object.
filter(Football.Revenues, Identifier %in%
c("LFC","MCFC"))
# we like to use %in% but there are other methods.
LFCvMCFC.Revenues
## Season Rank Identifier Club Matchday Broadcast Commercial Total
## 1 2017/2018 5 MCFC Manchester City 56.6 211.5 235.4 503.5
## 2 2017/2018 7 LFC Liverpool FC 81.2 222.6 151.3 455.1
## 3 2016/2017 5 MCFC Manchester City 51.9 203.5 198.1 453.5
## 4 2016/2017 9 LFC Liverpool FC 68.8 156.8 138.9 364.5
## 5 2015/2016 5 MCFC Manchester City 52.5 161.4 178.7 392.6
## 6 2015/2016 9 LFC Liverpool FC 56.8 125.7 119.5 302.0
## 7 2014/2015 6 MCFC Manchester City 43.4 135.4 173.8 352.6
## 8 2014/2015 9 LFC Liverpool FC 57.1 124.6 116.4 298.1
## 9 2013/2014 6 MCFC Manchester City 47.5 133.2 165.8 346.5
## 10 2013/2014 9 LFC Liverpool FC 51.0 101.0 103.8 255.8
## 11 2012/2013 6 MCFC Manchester City 39.6 88.4 143.0 271.0
## 12 2012/2013 12 LFC Liverpool FC 44.6 63.9 97.7 206.2
## 13 2011/2012 7 MCFC Manchester City 30.8 88.2 112.1 231.1
## 14 2011/2012 9 LFC Liverpool FC 45.2 63.3 80.2 188.7
## 15 2010/2011 9 LFC Liverpool FC 40.9 65.3 77.4 183.6
## 16 2010/2011 12 MCFC Manchester City 26.6 68.8 57.8 153.2
## 17 2009/2010 8 LFC Liverpool FC 42.9 79.5 62.1 184.5
## 18 2009/2010 11 MCFC Manchester City 24.4 54.0 46.7 125.1
## 19 2008/2009 7 LFC Liverpool FC 42.5 74.6 67.7 184.8
## 20 2008/2009 19 MCFC Manchester City 20.8 48.2 18.0 87.0
## 21 2007/2008 7 LFC Liverpool FC 39.2 76.3 51.5 167.0
## 22 2007/2008 20 MCFC Manchester City 18.5 43.3 20.5 82.3
We can see that the list is now filtered the way we like. Let’s summarize and plot a visual of total revenues over the period in question.
Revenue.Totals <-
LFCvMCFC.Revenues %>% # we will discuss the "pipe" %>% below.
group_by(Club) %>%
summarize(Total = sum(Total))
Revenue.Totals
## # A tibble: 2 × 2
## Club Total
## <chr> <dbl>
## 1 Liverpool FC 2790.
## 2 Manchester City 2998.
A decent amount of money, then.
Revenue.Pie <-
plot_ly(
data = Revenue.Totals,
labels = ~Club,
values = ~Total,
type = 'pie',
marker = list(colors = footie_palette)) %>% #note piping '%>%' the plot to layout.
layout(title = 'Total Revenue LFC v MCFC - 2007/2008 to 2017/2018 (£ Millions)')
Revenue.Pie
What’s £200 Million between friends? So, from the 2007/2008 season up to the 2017/2018 season, Manchester City officially recorded a smidgeon under £3 Billion in revenues while Liverpool recroded £2.79 Billion. One would not think this to be an astronomical advantage. But let’s see how the numbers look over time.
Revenue.Plot <-
plot_ly(
data = LFCvMCFC.Revenues,
x = ~Season,
y = ~Total,
type = "scatter",
mode = "lines",
color = ~Identifier,
colors = footie_palette) %>%
layout(legend = list(x = 0.05, y = 0.95), #sometimes we move the legend around
title = "LFC & MCFC Annual Revenue",
yaxis = list(title = '(£ Millions)'))
Revenue.Plot
This is an interesting graphic. Liverpool booked more revenue than Manchester City from the 2007/2008 season until 2010/2011 season. From 2011/2012 onwards, Manchester City has steadily outperformed Liverpool. Although, for 2017/2018 it appears Liverpool have shrunk the gap somewhat. Whether this becomes a future trend is impossible to say but worth watching. In any event, it is clear that over 11 seasons, the overall revenue for these two teams had a steep incline. No doubt this is a large reflection of the tv deals signed by the Premier League over the past decade.
We do not expect to see a striking visualization but let’s check and see how each compares on the three major categories of revenue.
Stream.Summary <-
LFCvMCFC.Revenues %>%
group_by(Club)%>%
summarize(Matchday = sum(Matchday), Commercial = sum(Commercial), Broadcast = sum(Broadcast))
Stream.Summary
## # A tibble: 2 × 4
## Club Matchday Commercial Broadcast
## <chr> <dbl> <dbl> <dbl>
## 1 Liverpool FC 570. 1066. 1154.
## 2 Manchester City 413. 1350. 1236.
Matchday.Plot <-
plot_ly(
data = LFCvMCFC.Revenues,
x = ~Season,
y = ~Matchday,
type = 'scatter',
mode = 'lines',
color = ~Identifier,
colors = footie_palette,
showlegend = FALSE
) %>%
layout(annotations = list(x = 0.2 , y = .95, text = "Matchday", showarrow = F,
xref='paper', yref='paper'))
Broadcast.Plot <-
plot_ly(
data = LFCvMCFC.Revenues,
x = ~Season,
y = ~Broadcast,
type = 'scatter',
mode = 'lines',
color = ~Identifier,
colors = footie_palette,
showlegend = FALSE
) %>%
layout(annotations = list(x = 0.4 , y = .95, text = "Broadcast", showarrow = F,
xref='paper', yref='paper'))
Commercial.Plot <-
plot_ly(
data = LFCvMCFC.Revenues,
x = ~Season,
y = ~Commercial,
type = 'scatter',
mode = 'lines',
color = ~Identifier,
colors = footie_palette
) %>%
layout(annotations = list(x = 0.6 , y = .95, text = "Commercial", showarrow = F,
xref='paper', yref='paper'))
Stream.Plot <-
subplot(Matchday.Plot, Broadcast.Plot, Commercial.Plot, shareY = TRUE) %>%
layout(title = "LFC & MCFC Revenue Categories",
yaxis = list(title = '(£ Millions)')
)
Stream.Plot
That certainly tells a story.
One might suspect the club receives generous arrangements from sister companies within the family. However, we will have to see if this sort of data is available for review.
Grabbing the data. The read_xlsx() function allows us to auto-generate an R data.frame directly from an existing MS Excel file. There are similar functions, available through R and packages, for other file types such as read.csv(). read_xlsx() is a function found within the library(readxl).
LFCvMCFC <- read_xlsx("footiedata/LFCVMCFC.xlsx")
The data enclosed was produced using information on transferleague.co.uk, transfermarkt.com/.co.uk, wikipedia and club/fan sites to help clarify some questions.
In most cases, transfer amounts are as listed on transferleague.co.uk and player attributes primarily sourced on transfermarkt. Some notes:
A quick look at the summary data:
#display summary
summary(LFCvMCFC)
## Date Club Player_Purchased Fee Birthdate Age_at_Transfer
## Min. :2008-07-01 00:00:00.00 Length:164 Length:164 Min. : 0.00 Min. :1974-04-12 00:00:00.000 Min. :15.28
## 1st Qu.:2010-07-01 00:00:00.00 Class :character Class :character 1st Qu.: 1.65 1st Qu.:1984-06-04 00:00:00.000 1st Qu.:21.17
## Median :2012-08-31 00:00:00.00 Mode :character Mode :character Median : 8.25 Median :1989-10-02 00:00:00.000 Median :24.29
## Mean :2013-02-25 14:38:02.93 Mean :13.98 Mean :1988-10-13 19:01:27.804 Mean :24.39
## 3rd Qu.:2015-07-29 06:00:00.00 3rd Qu.:22.00 3rd Qu.:1993-03-10 18:00:00.000 3rd Qu.:27.01
## Max. :2019-07-09 00:00:00.00 Max. :75.00 Max. :2000-07-23 00:00:00.000 Max. :35.33
## From League Position International
## Length:164 Length:164 Length:164 Length:164
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
Quick points of interest are median transfer fee at £8.25 Million, median age at time of transfer 24.29 years.
We intend to plot against calendar years primarily and add a column “Year” using lubridate year() function:
#add column to df extracting year from "Date"
LFCvMCFC$Year <-
year(LFCvMCFC$Date)
#display structure
str(LFCvMCFC)
## tibble [164 × 11] (S3: tbl_df/tbl/data.frame)
## $ Date : POSIXct[1:164], format: "2018-07-01" "2018-07-01" "2018-07-19" "2018-07-13" ...
## $ Club : chr [1:164] "LFC" "LFC" "LFC" "LFC" ...
## $ Player_Purchased: chr [1:164] "Fabinho" "Naby Keïta" "Alisson" "Xherdan Shaqiri" ...
## $ Fee : num [1:164] 39 52.8 56.2 13 3 ...
## $ Birthdate : POSIXct[1:164], format: "1993-10-23" "1995-02-10" "1992-10-02" "1991-10-10" ...
## $ Age_at_Transfer : num [1:164] 24.7 23.4 25.8 26.8 19.8 ...
## $ From : chr [1:164] "Monaco" "RB Leipzig" "AS Roma" "Stoke City" ...
## $ League : chr [1:164] "Ligue 1" "Bundesliga" "Serie A" "Premier League" ...
## $ Position : chr [1:164] "Defensive Midfiled" "Centre Midfield" "Goalkeeper" "Right Wing" ...
## $ International : chr [1:164] "Brazil" "Guinea" "Brazil" "Switzerland" ...
## $ Year : num [1:164] 2018 2018 2018 2018 2017 ...
How much money did both teams spend on transfers from mid 2008 to 2019?
SpendSummary <-
LFCvMCFC %>%
group_by(Club) %>%
summarise(TotalFees = sum(Fee))
SpendSummary
## # A tibble: 2 × 2
## Club TotalFees
## <chr> <dbl>
## 1 LFC 901.
## 2 MCFC 1391.
Yeah, quite a lot.
TotalPie <-
plot_ly(
data = SpendSummary,
labels = ~Club,
values = ~TotalFees,
type = 'pie',
marker = list(colors = footie_palette)) %>% #note piping '%>%' the plot to layout.
layout(title = 'Total Transfer Spend LFC v MCFC - 2008 to 2019 (£ Millions)')
TotalPie
Of all the fees that LFC and MCFC spent from 2008 to 2019, Liverpool accounted for 39% and MCFC accounted for 61%. Or put another way, MCFC spent half as much again as LFC…
(1391.05-900.95)/900.95 * 100
## [1] 54.39814
Intuitively, fans will probably have expected amounts in this order of magnitude.
Let’s see how that spending played out year over year…
AnnualSpending <-
LFCvMCFC %>%
group_by(Year, Club) %>%
summarise(AnnualSpend = sum(Fee))
## `summarise()` has grouped output by 'Year'. You can override using the `.groups` argument.
AnnualSpending
## # A tibble: 23 × 3
## # Groups: Year [12]
## Year Club AnnualSpend
## <dbl> <chr> <dbl>
## 1 2008 LFC 39
## 2 2008 MCFC 77.7
## 3 2009 LFC 36
## 4 2009 MCFC 168
## 5 2010 LFC 22.4
## 6 2010 MCFC 135.
## 7 2011 LFC 114.
## 8 2011 MCFC 100
## 9 2012 LFC 28.9
## 10 2012 MCFC 57
## # ℹ 13 more rows
AnnualPlot <-
plot_ly(
data = AnnualSpending,
x = ~Year,
y = ~AnnualSpend,
type = "scatter",
mode = "line",
color = ~Club,
colors = footie_palette) %>%
layout(legend = list(x = 0.05, y = 0.95), #sometimes we move the legend around
title = "LFC & MCFC Annual Transfer Spending",
yaxis = list(title = 'Spent (£ Millions)'))
AnnualPlot
It would appear Manchester City took full advantage of new investment in 2008, 2009 and 2010. This pattern also resumed in 2015, 2016 and 2017 where significant investment was made. Liverpool spent more on transfers than Manchester City during only three years; 2011, 2014 and 2018.
Let’s see what patterns might emerge from a look at the individual transactions.
TransactionScatter <-
plot_ly(
data = LFCvMCFC,
x = ~Year,
y = ~Fee,
type = "scatter",
mode = 'markers',
color = ~Club,
colors = footie_palette,
size = 10, #we have opted for a standard size here
marker = list(opacity = 0.75), #we have a lot of density in parts so opacity is increased
text = ~paste(Player_Purchased, Fee, Club), #text and hoverinfo introduced to improve default hovertext
hoverinfo = "text") %>%
layout(legend = list(x = 0.06, y = 1.0),
title = "LFC & MCFC Individual Transfers",
yaxis = list(title = 'Cost (£ Millions)'))
TransactionScatter
Interestingly, we can see LFC has the highest amount paid for a single player (Virgil van Dijk in 2018) but in all other years, MCFC purchased the most expensive player; in many years, several more expensive players than LFC.
Let’s have a look at distribution of purchase transactions:
PriceViola <- LFCvMCFC %>%
plot_ly(
x = ~Club,
y = ~Fee,
split = ~Club,
type = 'violin',
colors = footie_palette,
color = ~Club,
box = list(visible = T),
meanline = list(visible = T)) %>%
layout(title = "LFC & MCFC Fee Distribution of Transfer Targets",
xaxis = list(title = "Club"),
yaxis = list(title = "Fee (£ Millions)",
zeroline = F)
)
PriceViola
We like this visual. You can see that Liverpool’s distribution of player purchase values skews lower than Manchester City. Liverpool median = £7 Million and Manchester City £12 Million. The LFC mean = £11.5 Million and MCFC mean = £16.1 Million. Liverpool KDE of 1 collides at £3.27 Million vs £4.27 Million for Manchester City. LFC 3rd quartile is £17 Million and Manchester City at £25.5 Milion. If one assumes a correlation of quality to cost, this visualization would suggest a deeper distribution of quality in the Manchester City side.
We were curious how these massive financial resources had be allocated to positions on the pitch, also…
PositionSummary <-
LFCvMCFC %>%
group_by(Position, Club) %>%
summarise(Total= sum(Fee))
## `summarise()` has grouped output by 'Position'. You can override using the `.groups` argument.
PositionSummary
## # A tibble: 24 × 3
## # Groups: Position [12]
## Position Club Total
## <chr> <chr> <dbl>
## 1 Attacking Midfield LFC 31.8
## 2 Attacking Midfield MCFC 102
## 3 Centre Back LFC 138.
## 4 Centre Back MCFC 261.
## 5 Centre Forward LFC 200.
## 6 Centre Forward MCFC 257.
## 7 Centre Midfield LFC 186.
## 8 Centre Midfield MCFC 88.8
## 9 Defensive Midfiled LFC 53.5
## 10 Defensive Midfiled MCFC 102.
## # ℹ 14 more rows
Visual will make this more meaningful…
PositionScatter <-
plot_ly(
data = PositionSummary,
x = ~Total,
y = ~Position,
type = "scatter",
mode = 'markers',
color = ~Club,
colors = footie_palette,
size = 10) %>%
layout(title = "LFC & MCFC Transfer Spend by Position",
xaxis = list(title = 'Total (£ Millions)'))
PositionScatter
We are not surprised that most financial resources have been allocated to the spine of the team in Centre Back, Centre Midfield and Centre Forward. This is true of both teams although MCFC have deployed significantly more. Interesting outlier at Right Wing where MCFC have deployed susbstantial resources.
What stories emerge when looking at teams that players were purchased “From”?
PartnerSummary <-
LFCvMCFC %>%
group_by(Club, From) %>%
summarise(Total = sum(Fee))
## `summarise()` has grouped output by 'Club'. You can override using the `.groups` argument.
PartnerSummary
## # A tibble: 110 × 3
## # Groups: Club [2]
## Club From Total
## <chr> <chr> <dbl>
## 1 LFC 1.FSV Mainz 05 4.7
## 2 LFC AC Milan 16
## 3 LFC AEK Athens 1.5
## 4 LFC AS Roma 117.
## 5 LFC Ajax 23
## 6 LFC Arsenal 35
## 7 LFC Aston Villa 52.5
## 8 LFC Atletico Madrid 0
## 9 LFC Bayer 04 Leverkusen 11
## 10 LFC Benfica 20
## # ℹ 100 more rows
PartnerScatter <-
plot_ly(
data = PartnerSummary,
x = ~Total,
y = ~From,
type = "scatter",
mode = 'markers',
color = ~Club,
colors = footie_palette,
marker = list(size = ~Total/4)) %>% #we have changed scaling here to drag the eye in desired direction
layout(title = "LFC & MCFC Transfer Spending by Selling Club",
xaxis = list(title = "Spending (£ Millions)")
)
PartnerScatter
Liverpool really have spent a lot of money at Southampton and AS Roma. However, this plot is super high on density and low on readability. Do we need to see all the dots around y axis? Probably not. Let’s do a top 10 leveraging top_n() function.
TopPartnerSummary <-
top_n(PartnerSummary, 10, Total)
TopPartnerSummary
## # A tibble: 20 × 3
## # Groups: Club [2]
## Club From Total
## <chr> <chr> <dbl>
## 1 LFC AS Roma 117.
## 2 LFC Ajax 23
## 3 LFC Arsenal 35
## 4 LFC Aston Villa 52.5
## 5 LFC Monaco 39
## 6 LFC Newcastle 66.3
## 7 LFC RB Leipzig 52.8
## 8 LFC Southampton 170.
## 9 LFC Sunderland 26
## 10 LFC TSG Hoffenheim 29
## 11 MCFC Arsenal 70
## 12 MCFC Atletic Bilbao 57
## 13 MCFC Benfica 51
## 14 MCFC Everton 84.5
## 15 MCFC FC Porto 52
## 16 MCFC Leicester 60
## 17 MCFC Monaco 95
## 18 MCFC Real Madrid 59
## 19 MCFC Valencia 57
## 20 MCFC VfL Wolfsburg 82
TopPartnerScatter <-
plot_ly(
data = TopPartnerSummary,
x = ~Total,
y = ~From,
type = "scatter",
mode = 'markers',
color = ~Club,
colors = footie_palette,
marker = list(size = ~Total/4)) %>%
layout(title = "LFC & MCFC Top 10 Transfer Partners",
xaxis = list(title = "Spending (£ Millions)")
)
TopPartnerScatter
LFC and MCFC clearly had their preferred partners during the period in question. MCFC more evenly distributed across Monaco, Everton, Wolfsburg and Arsenal. We wonder if this pattern holds true when looking at the League players have come from. top_n() will be used here again.
LeagueSummary <-
LFCvMCFC %>%
group_by(Club, League) %>%
summarise(Total = sum(Fee))
## `summarise()` has grouped output by 'Club'. You can override using the `.groups` argument.
TopLeagueSummary <-
top_n(LeagueSummary, 10, Total)
TopLeagueScatter <-
plot_ly(
data = TopLeagueSummary,
x = ~Total,
y = ~League,
type = "scatter",
mode = 'markers',
color = ~Club,
colors = footie_palette,
marker = list(size = ~Total/5)) %>%
layout(title = "LFC & MCFC Transfer Purchases by Top 10 League of Origin",
xaxis = list(title = "Spending (£ Millions)")
)
TopLeagueScatter
Clearly, both teams have a preference for players already in the Premier League. Manchester City have shown a significant appetite for high value players from Spain’s La Liga and to a lesser extent Germany’s Bundesliga. Liverpool have also spent a good deal on Italy’s Serie A, which is no surprise as we earlier saw the value of transactions with AS Roma. We are going to sidetrack for a moment as a question just occurred to us. Did the arrival of Pep Guardiola as manager of Manchester City influence spending on players from La Liga and Bundesliga? Guardiola managed in both these leagues prior to joining Manchester City.
#Filter dataset
PepSpending <-
LFCvMCFC %>%
filter(Club == "MCFC" & League %in% c("La Liga", "Bundesliga", "Premier League"))
#Summarize the data
PepSpendingSummary <-
PepSpending %>%
group_by(Year, League) %>%
summarise(Total = sum(Fee))
## `summarise()` has grouped output by 'Year'. You can override using the `.groups` argument.
PepSpendingSummary
## # A tibble: 26 × 3
## # Groups: Year [11]
## Year League Total
## <dbl> <chr> <dbl>
## 1 2008 Bundesliga 6.7
## 2 2008 La Liga 39
## 3 2008 Premier League 14
## 4 2009 Bundesliga 16
## 5 2009 La Liga 0
## 6 2009 Premier League 152
## 7 2010 Bundesliga 11
## 8 2010 La Liga 49
## 9 2010 Premier League 26
## 10 2011 Bundesliga 27
## # ℹ 16 more rows
And let’s visualize this with Premier League for reference:
PepSpendingPlot <-
plot_ly(
data = PepSpendingSummary,
x = ~Year,
y = ~Total,
type = "scatter",
mode = "line",
color = ~League) %>%
layout(title = "MCFC Spending on Bundesliga and La Liga",
yaxis = list(title = "Spending (£ Millions)")
)
PepSpendingPlot
Pep Guardiola arrived February 1, 2016. There is clearly an uptick for spending on players from La Liga but peaks also exist in 2010 and 2013. Pep’s knowledge of the league may have influenced some of the recent purchases.
Fans may be curious about distribution of purchases across country of origin. This is what we found. Again, we have refined based on a top 10.
NationSummary <-
LFCvMCFC %>%
group_by(Club, International) %>%
summarise(Total = sum(Fee))
## `summarise()` has grouped output by 'Club'. You can override using the `.groups` argument.
TopNationSummary <-
top_n(NationSummary, 10, Total)
TopNationScatter <-
plot_ly(
data = TopNationSummary,
x = ~Total, y = ~International,
type = "scatter",
mode = 'markers',
color = ~Club,
colors = footie_palette,
marker = list(size = ~Total/3)) %>%
layout(title = "LFC & MCFC Transfer Spending by Top 10 Country of Origin",
xaxis = list(title = "Spending (£ Millions)",
yaxis = list(title = "Country")
)
)
TopNationScatter
In terms of monetary value, both clubs have spent the most on English players. MCFC have also invested large sums in French and Brazilian players. The absolute numbers are certainly telling but how are they trending? We will examine the top 6 nationalities represented above.
#filter the data
LFCFliter <-
LFCvMCFC %>%
filter(Club == "LFC" & International %in% c("England", "Brazil", "France", "Argentina", "Netherlands", "Spain"))
#summarize
LFCNationTrendSummary <-
LFCFliter %>%
group_by(Year, International) %>%
summarise(Total = sum(Fee))
## `summarise()` has grouped output by 'Year'. You can override using the `.groups` argument.
LFCNationTrendPlot <-
plot_ly(
data = LFCNationTrendSummary,
x = ~Year, y = ~Total,
type = "scatter",
mode = 'line',
linetype = ~International
)
LFCNationTrendPlot
## Adding lines to mode; otherwise linetype would have no effect.
## Adding lines to mode; otherwise linetype would have no effect.
## Adding lines to mode; otherwise linetype would have no effect.
## Adding lines to mode; otherwise linetype would have no effect.
## Adding lines to mode; otherwise linetype would have no effect.
## Adding lines to mode; otherwise linetype would have no effect.
For Liverpool we can see the early impact of spending on English players which also shows a recent slow increase. Large jump in resources deployed on Brazilian and Dutch players. Spending on the Dutch is due to the record on Virgil van Diyk.
and MCFC?
#filter the data
MCFCFliter <-
LFCvMCFC %>%
filter(Club == "MCFC" & International %in% c("England", "Brazil", "France", "Argentina", "Netherlands", "Spain"))
#summarize
MCFCNationTrendSummary <-
MCFCFliter %>%
group_by(Year, International) %>%
summarise(Total = sum(Fee))
## `summarise()` has grouped output by 'Year'. You can override using the `.groups` argument.
MCFCNationTrendPlot <-
plot_ly(
data = MCFCNationTrendSummary,
x = ~Year, y = ~Total,
type = "scatter",
mode = 'line',
linetype = ~International
)
MCFCNationTrendPlot
## Adding lines to mode; otherwise linetype would have no effect.
## Adding lines to mode; otherwise linetype would have no effect.
## Adding lines to mode; otherwise linetype would have no effect.
## Adding lines to mode; otherwise linetype would have no effect.
## Adding lines to mode; otherwise linetype would have no effect.
## Adding lines to mode; otherwise linetype would have no effect.
We are curious about the ages of players being purchased. Are they younger or established? Is one team buying younger than the other and investing for a longer future? We will use some slightly more involved statistical investigation here, opting for a violin plot. Please google them if you need detailed understanding.
AgeViola <- LFCvMCFC %>%
plot_ly(
x = ~Club,
y = ~Age_at_Transfer,
split = ~Club,
type = 'violin',
colors = footie_palette,
color = ~Club,
box = list(visible = T),
meanline = list(visible = T)) %>%
layout(title = "LFC & MCFC Age Distribution of Transfer Targets",
xaxis = list(title = "Club"),
yaxis = list(title = "Age at Transfer",
zeroline = F)
)
AgeViola
It is down to the individual reader to make any inferences here in terms of club policy. However, LFC does demonstrate a mean age of transfer targets almost a full year younger than MCFC. This is also true of the q3 values and slightly less true of q1. The difference is reduced somewhat looking at median. KDE of 1 occurs for LFC at ~25 years of age and for MCFC at ~24.80. Again, readers can reach their own conclusions but we find this interesting and would be of even greater interest if combined with data regarding contract length, etc.
We feel like trying a subplot of Country and League of Origin…
TopLeagueScatterSub <-
plot_ly(
data = TopLeagueSummary,
x = ~Total,
y = ~League,
type = "scatter",
mode = 'markers',
color = ~Club,
colors = footie_palette,
marker = list(size = ~Total/5),
showlegend = FALSE) %>%
layout(xaxis = list(title = "Spending (£ Millions)")
)
TopNationScatterSub <-
plot_ly(
data = TopNationSummary,
x = ~Total, y = ~International,
type = "scatter",
mode = 'markers',
color = ~Club,
colors = footie_palette,
marker = list(size = ~Total/4)) %>%
layout(xaxis = list(title = "Spending (£ Millions)")
)
NationAndLeagueSub <-
subplot(TopLeagueScatterSub, TopNationScatterSub, shareX = TRUE, margin = 0.1) %>%
layout(title = "LFC & MCFC Transfer Spending by Nationality and League of Origin")
NationAndLeagueSub
Generating HTML datatables:
RevTable <- datatable(Football.Revenues, caption = "Deloitte Football Money League Rankings (all revenues in £ Millions)")
RevTable
PurchaseTable <- datatable(LFCvMCFC, caption = "Player Purchase Transactions (Fee £ Millions")
PurchaseTable