The Retrodiction Question

An Interview with

Jeff Bihl
Jeff's Computer Ratings
NCAA Football and Basketball, NFL

YL: Last year I tried to get Kenneth Massey to do a comparison page with college basketball as he has been doing with college football. He wasn't sure he could devote the time to it. Now I see that not only has he begun one but that you have one, too. How long have you been tracking college basketball in this format? Do you plan to expand to other sports?

Jeff: I started the comparison earlier this year, I think only about a week or two before Kenneth. I thought that a basketball comparison would be interesting and I asked Kenneth if it was OK to use a similar format. I think that reminded him he had been planning on setting one up. I kept putting mine up because once the parser is written it isn't hard to maintain. I might try another sport sometime. The parser would be easy to convert. It's really only interesting if you have at least 9 or 10 ratings and I think that college computer rankings are a lot more interesting than pro sports for several reasons.

YL: I've tried a different approach with my Superlists - ranking the teams with the median rank and not an average. A lot of teams would be a little difficult for me and wouldn't make enough difference to expend the effort. Different sports team raters have different ideas of what the most important factors are. What do you consider to be the most important factors?

Jeff: I can see how one would say that average ranking is not the best way. I was going to average team ratings instead but that leads to other problems such as how to scale each system. I have an average of the average ranking of the teams in each conference. That leads to even more problems. There may be more difference in strength in Division 1 basketball between team #1 and team #30 than between team #90 and team #230 but the latter will have a much bigger effect on an average. Rating would be a better option here if it were more feasible.
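The averaging problem described here can be illustrated with a small sketch. All of the numbers below are hypothetical and chosen only to show the effect; they are not taken from any real rating system or from Jeff's comparison page.

```python
from statistics import mean, median

# Hypothetical ranks given to one team by five rating systems.
ranks = [3, 4, 5, 6, 40]
consensus_mean = mean(ranks)      # 11.6, pulled upward by the one outlier
consensus_median = median(ranks)  # 5, unaffected by how extreme the outlier is

# Hypothetical rank -> rating table in which the gap between #1 and #30
# is larger than the gap between #90 and #230.
rating_of_rank = {1: 95.0, 30: 80.0, 90: 72.0, 230: 62.0}

# Two hypothetical two-team conferences, described by their teams' ranks.
conf_a = [1, 230]
conf_b = [30, 90]

avg_rank_a = mean(conf_a)    # 115.5
avg_rank_b = mean(conf_b)    # 60
avg_rating_a = mean(rating_of_rank[r] for r in conf_a)  # 78.5
avg_rating_b = mean(rating_of_rank[r] for r in conf_b)  # 76.0

print(f"mean rank {consensus_mean}, median rank {consensus_median}")
print(f"average rank:   A={avg_rank_a}, B={avg_rank_b}")
print(f"average rating: A={avg_rating_a}, B={avg_rating_b}")
```

With these made-up numbers, conference B looks far better by average rank, while conference A is slightly better by average rating. Averaging ratings across several systems would still require putting them on a common scale, which is the difficulty mentioned above.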
    I post results from two different systems with two different purposes that work on vastly different criteria. One is a pure predictor and the other is more of a retrodictor. The predictive criteria are not really subjective but were arrived at through testing to see which parameters resulted in the smallest mean error on past data. The retrodictive system works by trying to answer the question of how difficult a team's win/loss record was to achieve against its schedule. The retrodictive system's criteria would be my main criteria if I voted in a poll. It ranks teams based on which teams they defeated and who defeated them instead of who would win an upcoming or hypothetical matchup. These are two completely different things and that is why I have two systems. The retrodictive system is a lot more complicated than the predictor but it is not pure retrodiction, as its intention is not to reduce ranking violation percentage.
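The two yardsticks mentioned in this answer, mean error for a predictor and ranking violation percentage for a retrodictor, can be sketched as follows. This is a minimal illustration with hypothetical teams, results, ranks, and ratings. It assumes that a predicted margin is simply the difference of two ratings and that "mean error" means mean absolute error, neither of which is stated in the interview.

```python
def violation_percentage(games, rank):
    """Percent of games in which the winner is ranked worse than the loser.

    games: list of (winner, loser) pairs; rank: team -> rank (1 is best).
    """
    violations = sum(1 for winner, loser in games if rank[winner] > rank[loser])
    return 100.0 * violations / len(games)


def mean_abs_error(results, rating):
    """Mean absolute difference between predicted and actual margins.

    results: list of (team, opponent, actual_margin) from the team's side.
    The predicted margin is assumed to be rating[team] - rating[opponent].
    """
    errors = [abs((rating[t] - rating[o]) - margin) for t, o, margin in results]
    return sum(errors) / len(errors)


# Hypothetical data: A beat B, B beat C, C beat A, A beat D.
games = [("A", "B"), ("B", "C"), ("C", "A"), ("A", "D")]
rank = {"A": 1, "B": 2, "C": 3, "D": 4}

results = [("A", "B", 7), ("B", "C", 3), ("C", "A", 1), ("A", "D", 14)]
rating = {"A": 10.0, "B": 5.0, "C": 2.0, "D": -4.0}

print(violation_percentage(games, rank))  # 25.0 (only C over A is a violation)
print(mean_abs_error(results, rating))    # 2.75
```

A pure retrodictor would search for the ordering that minimizes the first number on past games, while a predictor would tune its parameters to minimize the second.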

YL: If you had to narrow it down to one perfect system, what would be the most important factors?

Jeff: I can't really give a single answer to that. I don't think you have to say a system is good or bad because it may be good at one thing and bad at another. If you must judge a system you should judge it against the purpose for which it was written. I scaled the rating of my retrodictive system so that it would best fit a projected margin of victory but the system would make a very poor predictor if it were actually used for that. For a system that I would use for the purpose of the BCS, I would say wins and losses and strength of schedule. I think strength of schedule for this purpose can only be evaluated on a game-by-game basis. Averaging the strength of the teams on the schedule is oversimplifying. Schedule strength can be relative to the strength of the team to whom the schedule belongs, depending on how the individual strengths of the teams on the schedule break down.
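One way to see why an averaged schedule strength oversimplifies is to evaluate a schedule game by game. The sketch below is not Jeff's method. It assumes a logistic win-probability curve with an arbitrary scale, and all ratings are hypothetical.

```python
def win_probability(team_rating, opp_rating, scale=10.0):
    """Assumed logistic model: chance that the team beats the opponent."""
    return 1.0 / (1.0 + 10.0 ** (-(team_rating - opp_rating) / scale))


def expected_wins(team_rating, schedule):
    """Sum of game-by-game win probabilities against a list of opponent ratings."""
    return sum(win_probability(team_rating, opp) for opp in schedule)


# Two hypothetical schedules with the same average opponent rating (0).
spread = [15.0, 15.0, -15.0, -15.0]  # two strong and two weak opponents
flat = [0.0, 0.0, 0.0, 0.0]          # four middling opponents

strong_team, weak_team = 15.0, -15.0

for name, team in (("strong", strong_team), ("weak", weak_team)):
    print(name,
          round(expected_wins(team, spread), 3),
          round(expected_wins(team, flat), 3))
```

Under these assumptions the strong team expects fewer wins against the spread schedule than against the flat one, while the weak team expects more. The same pair of schedules ranks in opposite order of difficulty depending on who plays it, even though the average opponent strength is identical.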

YL: What do you see as the future of computer ratings?

Jeff: The public still has not accepted computer rankings. For that to happen the press will have to become friendlier to them and the public will have to stop applying computer rankings to a purpose for which they were not written. I will use the general predictive vs. retrodictive comparison as an example. There may be a hard-luck team with several close losses and several blowout wins that may really be the "best team" even though there are teams out there with better records against similar schedule strength. If someone approaches a predictive ranking with the same mindset as they do the major media polls, their opinion of computer systems in general will be soured. They probably either consciously or subconsciously recognize that "best team" and "most deserving team" are not one and the same, but if they approach the ranking with the wrong mindset they won't accept it. I remember Florida being favored to win the Sugar Bowl over Florida St. at the end of the 1996 season even though Florida St. was undefeated and had already defeated Florida. Obviously, for Florida to be favored, more than half the public believed they would win. Nobody complained that was unjust. However, at the same time many people would have found a ranking where Florida had a #1 next to their name prior to the rematch totally unacceptable. The concept of #1 already has a preset meaning. The reverse is true but less likely. If a retrodictive type system were used to try to predict outcomes it would usually not work very well and possibly sour a person on how well computers can predict games. Unfortunately, it is probably up to the press to explain that. I don't think most of the press understands that and if they did they would not want any change. For some reason the accepted ranking standard in college is a poll of the press. A press poll has its merits but I fail to see how anyone could say that it is the only possible method. The press used their power to convince the public of their own qualifications.

YL: What interests do you have outside computer ratings?

Jeff: I like computers and sports so the computer rankings were a natural interest. So are sports video games. Really, I just like games of all kinds.

YL: When did you begin to rank teams?

Jeff: I was pretty young when I first tried computer rankings. I first tried them in the mid-1980s on my Commodore 64, mostly using hypothetical situations or sometimes data from single high school conferences. I realized that the convergence of a system with a large number of teams would take too long for my patience on a Commodore 64. It was not until I got to college and had access to faster computers (486 systems) that I really started thinking about running a computer system on an entire 1A football schedule. The 1994 poll controversy piqued my interest enough to buy a USA Today and tediously convert an entire 1A football schedule and results into an entirely numeric data file. I posted weekly results on my dorm room door during the 1995 season. I first put the results for both the NFL and college on the internet for 1996 along with scores from a separate predictive system. I started college basketball near the end of the 1999-2000 season.

YL: Do you favor particular sports or teams?

Jeff: I still have some affinity for the Reds and A's left over from when I was young.

YL: What is your profession?

Jeff: Control Systems Engineer.

YL: Where do you call home?

Jeff: Right now Ohio is home but that will likely change again soon.

YL: Good luck. I hope it's a change you will find welcome. Is there anything else you would like us to know about you?

We would like to thank Jeff for the interview.

