Payroll and Winning Percentage in the MLB

In lecture, Professor Wyner discussed the relationship between a team’s payroll and its winning percentage. In particular, for each season, he computed the “relative payroll” of each team by dividing its payroll by the median payroll of all teams in that season. We will replicate his analysis in the following problems using the dataset “mlb_relative_payrolls.csv”, which you may download here if you haven’t received it yet.
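To make the definition of relative payroll concrete, here is a minimal sketch on a small made-up table. The team names and payroll values are invented for illustration and do not come from “mlb_relative_payrolls.csv”. The name `toy_payrolls` is hypothetical, and the sketch assumes the dplyr package is installed. The provided dataset already contains a Relative_Payroll column, so you will not need to run this yourself.

```r
library(dplyr)

# Made-up payrolls for three hypothetical teams over two seasons
toy_payrolls <- tibble(
  Team = c("A", "B", "C", "A", "B", "C"),
  Year = c(2000, 2000, 2000, 2001, 2001, 2001),
  Team_Payroll = c(50, 100, 200, 60, 120, 300)
)

# Within each season, divide each payroll by that season's median payroll
toy_relative <- toy_payrolls %>%
  group_by(Year) %>%
  mutate(Relative_Payroll = Team_Payroll / median(Team_Payroll)) %>%
  ungroup()

toy_relative
```

A team with a relative payroll of 1 spent exactly the median amount that season, and a team with a relative payroll of 2 spent twice the median.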

  1. Read the data in from the file and save it as a tbl called “relative_payroll”.
> library(tidyverse)
> relative_payroll <- read_csv(file = "data/mlb_relative_payrolls.csv")
Parsed with column specification:
cols(
  Team = col_character(),
  GM = col_character(),
  Team_Payroll = col_integer(),
  Winning_Percentage = col_double(),
  Year = col_integer(),
  Relative_Payroll = col_double()
)
  2. Make a histogram of team winning percentages. Play around with different binwidths.

  3. Make a histogram of the relative payrolls.

  4. Make a scatterplot with relative payroll on the horizontal axis and winning percentage on the vertical axis.

  5. Without executing the code below, discuss with your group and see if you can figure out what it is doing.

> ggplot(data = relative_payroll) + geom_point(mapping = aes(x = Year, y = Team_Payroll))
  6. Execute the code above. What can you say about how team payrolls have evolved over time? Make a similar plot that visualizes how relative payrolls have evolved over time.