I'm trying to duplicate Kirk Goldsberry's plots to visualize NBA shot quality. You can see an example here: https://pbs.twimg.com/media/DzSnbKQWsAAtFga?format=jpg&name=small
I've gone through and cleaned the data. You can see an example of the dataframe here:
PLAYER_NAME | BASIC_ZONE | LOC_X | LOC_Y | Relative Percentage |
---|---|---|---|---|
DeMar DeRozan | Mid-Range | 71.0 | 151.0 | 0.021217 |
DeMar DeRozan | Mid-Range | -16.0 | 199.0 | 0.021217 |
DeMar DeRozan | Mid-Range | -25.0 | 65.0 | 0.021217 |
DeMar DeRozan | Mid-Range | -1.0 | 181.0 | 0.021217 |
The code I'm using to generate the hexbins is:
```python
import matplotlib.pyplot as plt

# player_sorted (DataFrame), player, year and draw_court() are defined earlier.

# Plot hexbins using matplotlib
fig, ax = plt.subplots()
hexbin = ax.hexbin(
    x=player_sorted.LOC_X,
    y=player_sorted.LOC_Y,
    C=player_sorted['Relative Percentage'],
    gridsize=25,
    bins=[-0.0875, -0.0625, -0.0375, -0.0125, 0.0125, 0.0375, 0.0625, 0.0875],
    mincnt=5,
)  # , vmin=-0.1, vmax=0.1)
cb = fig.colorbar(hexbin, ax=ax)
cb.set_label('Relative Percentage to League Average')
title_text = player + ' ' + str(year)
plt.title(title_text)
draw_court(ax)

# Adjust the axis limits and orientation of the plot in order
# to plot half court, with the hoop by the top of the plot
ax.set_xlim(-250, 250)
ax.set_ylim(422.5, -47.5)

# Get rid of axis labels and tick marks
ax.set_xlabel('')
ax.set_ylabel('')
# Booleans, not the string 'off', which current matplotlib treats as truthy
ax.tick_params(labelbottom=False, labelleft=False)
plt.show()
```
The visual looks reasonable, but the bin values are wrong. I expected them to match the values in the Relative Percentage column, since the default is to take the mean of all the points in each hex. I checked whether it was somehow taking the sum instead, but the numbers don't add up either. I'm expecting values between -0.1 and +0.1 and am getting whole numbers.
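One possible source of the whole numbers, sketched with NumPy only. This assumes (from a reading of matplotlib's `hexbin` implementation, so worth verifying against the installed version) that when `bins` is a sequence, each hexagon's reduced value is replaced by its position among the sorted bin edges, much like `np.searchsorted`. The `hex_means` values are made up for illustration; only the edges come from the call above.

```python
import numpy as np

# Bin edges copied from the hexbin call above.
edges = np.array([-0.0875, -0.0625, -0.0375, -0.0125,
                  0.0125, 0.0375, 0.0625, 0.0875])

# Hypothetical per-hexagon means of 'Relative Percentage' (made-up values).
hex_means = np.array([-0.095, -0.05, 0.0, 0.021217, 0.07, 0.12])

# Assumption: with a sequence for `bins`, hexbin swaps each hexagon's value
# for its index among the sorted edges, roughly like this.
bin_index = np.searchsorted(np.sort(edges), hex_means)

print(bin_index)  # integer bin positions, not the original fractions
```

If that assumption holds, the colorbar would show bin indices from 0 to 8 rather than values in the ±0.1 range, which would match the whole numbers described.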
Removing the bins argument from the hexbin call seems to fix the plot scale. I don't understand why. Is it because some data was outside of the bin limits?

```python
hexbin = ax.hexbin(
    x=player_sorted.LOC_X,
    y=player_sorted.LOC_Y,
    C=player_sorted['Relative Percentage'],
    gridsize=50,
    mincnt=2,
    edgecolors='white',
    cmap='RdBu_r',
    vmin=-0.1,
    vmax=0.1,
)
```
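If the goal of the `bins` list was discrete color bands, one alternative worth trying is to leave `bins` out and pass the same edges to `matplotlib.colors.BoundaryNorm`, so the hexagons keep their real mean values while the colors are still stepped. The sketch below uses randomly generated stand-in data (`loc_x`, `loc_y`, `rel_pct` are hypothetical, not the real shot data) and omits `draw_court`; it is a starting point, not a confirmed fix.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop in a notebook
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import BoundaryNorm

# Hypothetical stand-in data with the same shape as the real columns.
rng = np.random.default_rng(0)
n = 2000
loc_x = rng.uniform(-250, 250, n)
loc_y = rng.uniform(-47.5, 422.5, n)
rel_pct = rng.uniform(-0.1, 0.1, n)

# Same edges as the original `bins` list, now used only for coloring.
edges = [-0.0875, -0.0625, -0.0375, -0.0125, 0.0125, 0.0375, 0.0625, 0.0875]
cmap = plt.get_cmap("RdBu_r")
norm = BoundaryNorm(edges, ncolors=cmap.N)

fig, ax = plt.subplots()
hb = ax.hexbin(loc_x, loc_y, C=rel_pct, gridsize=25, mincnt=2,
               cmap=cmap, norm=norm)
fig.colorbar(hb, ax=ax, label="Relative Percentage to League Average")

# The per-hexagon values are still means of C, not bin indices.
values = hb.get_array()
print(values.min(), values.max())
```

Note that `norm` and `vmin`/`vmax` should not be passed together; the norm's boundaries take over the role of the limits.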