New Israeli Study Predicts Who'll Win NBA Games, Based on Pregame Interviews

The Technion researchers’ system still needs improving but could lead to a revolution in data analysis – and of course sports betting

Itamar Katzir
The Houston Rockets' James Harden reacting to a call during a game against the Portland Trail Blazers, Lake Buena Vista, Florida, August 4, 2020.
Houston Rockets star James Harden. His stats are actually hard to predict. Credit: Kevin C. Cox / Pool Photo via AP

If you told Shaquille O’Neal during his playing days that a special computer could analyze his pregame interviews and predict in which games he’d sink a three-pointer, he probably would have laughed you out of the room. Shaq hit three-pointers so rarely that he himself probably couldn’t have made such a prediction.

But a computer? In 2020, the answer is yes, and it comes from Haifa: specifically, the Technion’s Faculty of Industrial Engineering and Management.

It’s all spelled out in the article “Predicting In-game Actions from Interviews of NBA Players” in the journal Computational Linguistics. The new system could predict how NBA players would perform in a particular game based on their previous three games and pregame interviews.

It turns out that the words people use can hint at the way they will act. The system can predict if an NBA player will perform above or below his average on statistics like points scored or fouls committed.

“In many fields, like economics and sports, there are models that try to predict human behavior based on all kinds of performance metrics,” says Prof. Roi Reichart, who conducted the study with doctoral students Nadav Oved and Amir Feder.

“In basketball, for instance, you examine how the player played based on all kinds of metrics from previous games – scoring average, rebounds and so on – while in economics you look at all sorts of market indicators or at how a specific stock is performing. These metrics generally don’t bring the personal narrative or mental or emotional state into the model, when clearly, in many areas in which people make decisions, maybe in every area, this is a very, very important aspect.”

A graph showing the accuracy of predictions of NBA players' stats, as published by Technion researchers in the journal Computational Linguistics in 2020.

The implications of the research are vast: predicting someone’s mental state based on word choice.

“It’s a bit of a window into the unconscious – like when you go to therapy and you don’t necessarily talk about this problem and that problem, but suddenly something else comes up. It’s a window into your feelings,” Reichart says.

“The really central thing here is that until now, people didn’t look at mental state as revealed by language as a way of predicting performance in sports. It’s a new opening. As a technology, this is the first time that it’s happening.”

Feder and Oved focused on basketball in part because they’ve had experience playing the game. “The disadvantage to talking with you on Zoom and not in the real world is that you don’t see how tall Nadav and I are – we’re friends from basketball,” Feder says.

A cool 60 percent

But basketball, especially the NBA, has other advantages – the high-quality and varied statistics and the many pregame and mid-game interviews.

Oved says basketball also provides an accurate research arena because it doesn’t involve laboratory conditions – the players have no research bias. Their only interest is to win; they don’t even know they’re being studied.

The Cleveland Cavaliers' Shaquille O'Neal in action in an Eastern Conference playoff game, Cleveland, April 27, 2010. Credit: Aaron Josefczyk / Reuters

The researchers’ system is based on an artificial-intelligence method called “deep learning.” The system analyzed the performances of 36 basketball players and 5,226 pairs of interviews and games. Based on pregame interviews, it had a 60 percent success rate in predicting when a player would deviate from his average on parameters such as three-point shots and points scored.

Predicting players’ actions based on past performance alone has just a 53 percent success rate, much closer to random odds.
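For readers who want to see what an "above or below average" prediction and its success rate look like in practice, here is a minimal Python sketch. It is not the researchers' code. It assumes the baseline is a player's mean over his previous three games, and the function names and point totals are made up for illustration.

```python
from statistics import mean


def above_recent_average(stat_by_game, window=3):
    """Label each game 1 if the stat beats the mean of the previous
    `window` games, else 0. The three-game window is an assumption."""
    labels = []
    for i in range(window, len(stat_by_game)):
        baseline = mean(stat_by_game[i - window:i])
        labels.append(1 if stat_by_game[i] > baseline else 0)
    return labels


def accuracy(predicted, actual):
    """Share of games where the predicted label matches the actual one."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(actual)


# Hypothetical points per game for one player (not real NBA data).
points = [20, 25, 30, 28, 22, 31, 27]
labels = above_recent_average(points)

# Hypothetical model guesses for the four games that have a baseline.
guesses = [1, 0, 0, 0]
print(labels, accuracy(guesses, labels))
```

On this scale, a coin flip would be expected to score about 0.5, which is why the gap between 53 percent and 60 percent is notable.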

“The key thing here was showing the importance of content that’s expressed in language and may reflect something about the mental state of people who are supposed to make decisions, and trying to integrate this in computational models that never included this variable before,” Oved says.

As seen in the attached graph, the system is better at predicting certain statistics for certain players. A black line in the center of the graph shows the average accuracy of the researchers’ system – about 60 percent. The chart shows the system’s accuracy in predicting seven different statistics for a raft of players: personal fouls (PF), points (PTS), field goal ratio (FGR), pass risk (PR), shot risk (SR), mean shot distance for two-pointers (MSD2P) and the same for three-pointers (MSD3P). The farther to the right the point on the graph, the better the prediction.

For example, the system is extremely good at predicting when centers will take three-point shots – something they rarely do. Thus the purple dots corresponding to the names of Shaquille O’Neal, Tim Duncan and Pau Gasol are way on the right.

“This means that maybe something special has to happen in their behavior for them to decide to shoot a three-pointer,” Feder says, adding that Shaq probably didn’t know he would be taking a three-point shot but the system did. Alas, the system is poor at predicting the number of points James Harden or Chris Paul will score in a particular game.

“It’s possible to discuss why this happens, but anyone who knows basketball would say that James Harden behaves in a very random way,” Feder says.

Details on the accuracy of predictions of NBA players' stats, as published by Technion researchers in the journal Computational Linguistics in 2020.

The power of positive thinking

Meanwhile, in pregame interviews, certain words correlate with better performance. Players who use words like “fun,” “star,” “great,” “good,” “enjoy,” “fan” and “basketball” tend to top their average point tally from their previous three games. Also, players who use words related to aggression tend to commit more fouls.

“A nice thing that happened here was that we didn’t tell the model that fouls mean aggression or anything like that,” Oved says. “The model deduced it.”

Reichart adds: “This is a very important point. We put the idea into the model but only retroactively did we analyze which words influenced its prediction about fouls or any other metric.”

Still, coaches, analysts or team officials aiming to predict players’ performances won’t be able to do so without the aid of the researchers’ model; Homo sapiens' analytic capacity falls short.

The Los Angeles Lakers' LeBron James, left, and Anthony Davis celebrating after defeating the Denver Nuggets in Lake Buena Vista, Florida, August 10, 2020. Credit: Ashley Landis / AP

“If, say, a certain team wants to use this tool to predict its players’ performances, it won’t be able to just read the study and say, ‘okay, now we can make these predictions without using the system,’” Reichart says.

“But it does indicate that if you take this system and apply it to your team, do more interviews and provide history, you’ll obtain a better prediction of the players’ performance.”

The researchers’ next step is to understand just how the model produces its predictions – to understand its thought process, as it were.

“Two things here are super intriguing to us – one is this window into the unconscious, finding clues within language on strategic behavior and human behavior in general. This is a subject that’s being very actively studied in the lab,” Feder says.

“The other thing is just understanding what the hell happened here. Let’s say you managed to build a model that’s unexpectedly good – we’re in this weird period of scientific research where the models are a little bit ahead of our understanding of their capabilities. We want to improve our ability to predict strategic behavior and also understand what we’ve learned along the way.”

Until then, James Harden’s beautiful randomness will probably keep confounding researchers.
