Predicting the NBA Part #1


As any reader of this blog knows, I have an infatuation with John Hollinger and his Player Efficiency Rating (PER). Over the weekend, as I pondered how offense and defense balance out in predicting NBA regular season success (at a super awesome hip club, of course), I decided to run some numbers and see how well PER can predict team success. From my other reading I knew that Points Scored Per 100 Possessions minus Points Surrendered Per 100 Possessions is a very good indicator of record, so we will use that as the baseline to compare our results against (and later we will work with it exclusively).

My first thought was just to do something a bit crude with the PERs and see what happened. I simply took the PERs of the twelve players with the most minutes played (minimum 30 games played) on each Eastern Conference team and added them together. I then ranked the teams by that total (higher being better, obviously) and compared the ranking to where the teams actually finished in the standings.

Team           Average Team PER   PER Rank   Actual Finish   Difference
Atlanta        13.31              12         13              -1
Boston         13.21              15         15               0
Charlotte      13.95               7         11              -4
Chicago        13.65              10          3               7
Cleveland      14.11               5          2               3
Detroit        15.53               1          1               0
Indiana        14.05               6          9              -3
Miami          14.93               3          5              -2
Milwaukee      13.78               9         14              -5
New Jersey     13.28              13          6               7
New York       13.64              11         12              -1
Orlando        13.28              15          8               7
Philadelphia   13.87               4         10              -6
Toronto        14.78               4          4               0
Washington     15.48               2          7              -5

As the chart above shows, the results were not very promising. The average distance between a team's PER rank and its actual finish was 3.4, which basically means a team that finishes 4th in PER is, on average, probably anywhere from 1st to 8th in reality. That tells us nothing useful. This data is only good for proving that certain approaches don't work.
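The procedure above can be sketched in a few lines of Python. The team scores and finishes here are made up for illustration, not the actual table data.

```python
# Rank teams by a summed-PER score (higher = better), then measure how far
# that ranking lands, on average, from the teams' actual finishes.
def avg_rank_error(scores, actual_finish):
    # Order teams from highest score to lowest; rank 1 = best.
    ordered = sorted(scores, key=scores.get, reverse=True)
    metric_rank = {team: i + 1 for i, team in enumerate(ordered)}
    # Average absolute gap between the metric rank and the actual finish.
    gaps = [abs(metric_rank[t] - actual_finish[t]) for t in scores]
    return sum(gaps) / len(gaps)

# Hypothetical five-team example:
scores = {"A": 15.5, "B": 14.9, "C": 14.1, "D": 13.6, "E": 13.2}
actual = {"A": 1, "B": 4, "C": 2, "D": 5, "E": 3}
print(avg_rank_error(scores, actual))  # 1.2
```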

My next idea was to rank teams based on every single player who played even a single minute for them during the year, multiplying each player's PER by his minutes played and then summing all of a team's PER*minutes together. Whoever had the highest score would be ranked first.

Team           Total PER*Min/10000   Rank   Actual Finish   Difference
Atlanta        27.88                 13     13               0
Boston         27.37                 12     15              -3
Charlotte      28.16                 15     11               4
Chicago        29.42                  6      3               3
Cleveland      29.65                  3      2               1
Detroit        32.20                  1      1               0
Indiana        27.47                 14      9               5
Miami          29.52                  5      5               0
Milwaukee      29.20                  7     14              -7
New Jersey     28.56                 10      6               4
New York       28.41                 11     12              -1
Orlando        28.61                  9      8               1
Philadelphia   28.80                  8     10              -2
Toronto        29.58                  4      4               0
Washington     31.44                  2      7              -5


The data came out slightly better than in our last attempt, as one would expect from a more logical measurement, but the results are still not great. With an average difference of 2.4, we could still only say our measured 4 seed was likely somewhere between 2 and 6, which is not nearly good enough. I believe this measurement also says something about defense, since so many of the strong offensive teams (like Washington, for instance) finished much higher in our measurement than in the real world. Defense must be very important in determining actual success.
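The minutes-weighted total described above is straightforward to compute. Here is a minimal sketch; the (PER, minutes) pairs are made up, not real player data.

```python
# Sum PER * minutes over every player who logged time for the team, then
# divide by 10,000 as in the table above so totals land in a readable range.
def team_per_minutes(roster, scale=10_000):
    return sum(per * minutes for per, minutes in roster) / scale

# Hypothetical roster: (PER, minutes played) for each player.
roster = [(22.1, 3100), (16.4, 2500), (13.0, 1800), (9.8, 400)]
print(round(team_per_minutes(roster), 2))  # 13.68
```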

If we simply take where each team finished this past year in points given up per 100 possessions and average that rank with our PER rankings, we get the following.

Team           PER Rank   Defensive Rank   Average   Actual Finish   Difference
Atlanta        13         12               12.5      13               0.5
Boston         12         10               11        15               4
Charlotte      15         11               13        11              -2
Chicago         6          1                3.5       3              -0.5
Cleveland       3          2                2.5       2              -0.5
Detroit         1          3                2         1              -1
Indiana        14          6               10         9              -1
Miami           5          5                5         5               0
Milwaukee       7         29               18        14              -4
New Jersey     10          8                9         6              -3
New York       11         15               13        12              -1
Orlando         9          4                6.5       8               1.5
Philadelphia    8          9                8.5      10               1.5
Toronto         4          7                5.5       4              -1.5
Washington      2         14                8         7              -1


With defensive ability taken into account, we have now reduced the average difference to only 1.5 ranks per team. So a team that finished fourth in our rankings would, on average, probably finish between 3 and 5 in reality. That is not a horrible prediction for such a complicated game using nothing but numbers, but how does this result compare to simply using the points scored vs. points given up per 100 possessions discussed at the top?
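Blending the two rank lists as described above is just an average followed by a re-rank. A minimal sketch, using made-up ranks for three teams:

```python
# Average each team's PER rank with its defensive rank, then re-rank the
# teams by that blended value (lowest average = best, rank 1).
def blended_rank(per_rank, def_rank):
    avg = {t: (per_rank[t] + def_rank[t]) / 2 for t in per_rank}
    ordered = sorted(avg, key=avg.get)  # lowest blended value first
    return {t: i + 1 for i, t in enumerate(ordered)}

# Hypothetical ranks for three teams:
per_rank = {"A": 1, "B": 3, "C": 2}
def_rank = {"A": 2, "B": 1, "C": 3}
print(blended_rank(per_rank, def_rank))  # {'A': 1, 'B': 2, 'C': 3}
```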

Team           Points Scored Per 100   Points Given Up Per 100   Difference   Rank   Rank Difference
Atlanta        103.0                   108.2                     -5.2         15     -2
Boston         102.9                   107.2                     -4.3         12      3
Charlotte      103.4                   107.7                     -4.3         13     -2
Chicago        105.2                    99.4                      5.8          1      2
Cleveland      105.5                   101.3                      4.2          3     -1
Detroit        109.2                   103.9                      5.3          2     -1
Indiana        102.8                   105.6                     -2.8          9      0
Miami          104.9                   104.7                      0.2          5.5   -0.5
Milwaukee      106.7                   112.0                     -5.3         15     -1
New Jersey     106.1                   106.2                     -0.1          7     -1
New York       105.5                   108.9                     -3.4         11      1
Orlando        104.6                   104.4                      0.2          5.5    2.5
Philadelphia   103.7                   106.8                     -3.1         10      0
Toronto        107.2                   105.8                      1.4          4      0
Washington     109.8                   110.9                     -1.1          8     -1

Using this ultra-simple, quick method we reduced the average difference to only 1.2, a measurably better result than anything we have gotten before. A team our measurement places at 4 would almost always finish between 3 and 5 in the real world. It is pretty clear, then, that PER alone, without taking the other side of the ball into account, is not effective, and that simply averaging PER ranks and defensive ranks isn't good enough. But can we find a way to use PER and defense to at least match the other method? I decided to try two ways. First, I would take the total PER*minutes/10000 and multiply it by 3.66 (doing this makes the average total PER equal to the average total points given up per 100 possessions) and then subtract total points given up per 100 possessions, weighing offense and defense equally. The resulting differences would be ranked and then compared. My other idea was to weigh offense slightly less than defense (a 3.5 multiplier) to see whether making defense a bigger factor would help. (I did not try weighing offense higher, since we saw before that weighing offense heavily did not help; a slight offensive favoring may work well, though, I cannot say.) The results were as follows for the 3.66 data:
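The 3.66 scheme described above reduces to one line: scale the summed PER total onto the same footing as points allowed per 100 possessions, then subtract. The inputs below are illustrative, not values from the table.

```python
# Scale a team's PER*minutes/10000 total by a weight (3.66 equalizes the
# averages of the two measures), then subtract points given up per 100
# possessions. Teams are then ranked by this difference, highest first.
def weighted_diff(per_total, pts_allowed_per100, weight=3.66):
    return per_total * weight - pts_allowed_per100

# Hypothetical inputs: a PER total of 32.2, allowing 103.9 points per 100.
print(round(weighted_diff(32.2, 103.9), 2))  # 13.95
```

Using `weight=3.5` instead reproduces the defense-leaning variant mentioned above.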

Team           Total PER*3.66   Points Given Up Per 100   Difference   Rank   Actual   Rank Difference
Atlanta        102.05           108.2                     -6.15        14     13        1
Boston         100.17           107.2                     -7.03        15     15        0
Charlotte      103.07           107.7                     -4.63        10     11       -1
Chicago        107.68            99.4                      8.28         2      3       -1
Cleveland      108.53           101.3                      7.23         3      2        1
Detroit        117.83           103.9                     13.93         1      1        0
Indiana        100.54           105.6                     -5.06        12      9        3
Miami          108.05           104.7                      3.35         5      5        0
Milwaukee      106.87           112.0                     -5.13        13     14       -1
New Jersey     104.53           106.2                     -1.67         9      6        3
New York       103.97           108.9                     -4.93        11     12       -1
Orlando        104.70           104.4                      0.30         7      8       -1
Philadelphia   105.42           106.8                     -1.38         8     10       -2
Toronto        108.25           105.8                      2.45         6      4        2
Washington     115.07           110.9                      4.17         4      7       -3

The 3.5 data came out extremely similar to the 3.66 data, with both landing at an average difference comparable to our simple method's 1.2. So by using this long and complicated method we have only matched our very simple method for determining where a team will finish. It may be slightly better, but only numerous additional trials on data sets from past years could show that for sure. For now, it's safe to say that for the quickest, easiest measurement of how a team will do, look at its Points Scored vs. Points Surrendered Per 100 Possessions.

There is still one problem with that method, though: what if you only have one team's data and want to see how that team is doing and how it will likely do for the rest of the year? How do you know how those numbers translate into actual victories at the end of an 82-game season? That answer comes tomorrow.

This post didn't lend itself to my picture laden style, sorry.
