One of the creators of the Botometer—a web tool Elon Musk used to estimate Twitter’s spam percentage for a court filing—has reportedly said that Musk’s calculation “doesn’t mean anything.” Kai-Cheng Yang, a Ph.D. candidate at Indiana University, “questioned the methodology used by Mr Musk’s team and told the BBC they had not approached him before using the tool,” a BBC article said today.
A Musk court filing on August 4 claimed a Botometer analysis of Twitter firehose data in the first week of July “shows that, during that timeframe, false or spam accounts accounted for 33 percent of visible accounts.” But as Yang pointed out, the Botometer provides scores from 0 to 5—with 5 being the most bot-like—and Musk’s court filing didn’t say where he set the cutoff between human and bot.
“In order to estimate the prevalence [of bots] you need to choose a threshold to cut the score,” Yang told the BBC. “If you change the threshold from a three to a two then you will get more bots and less human.” Because Musk’s court filing “doesn’t make the details clear,” Musk “has the freedom to do whatever he wants. So the number to me, it doesn’t mean anything,” Yang said.
“Technically, you can choose any threshold you want and to get any result you want,” Yang said in an earlier interview with Yahoo. The Botometer is a project of the Observatory on Social Media and the Network Science Institute at Indiana University.
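Yang's point about thresholds can be illustrated with a minimal sketch. The scores below are made up for illustration and are not real Twitter or Botometer data. The `bot_share` helper, and the choice to count an account as a bot when its score is at or above the cutoff, are assumptions; Musk's filing does not describe how his team did it.

```python
# Hypothetical Botometer-style scores (0 = most human-like, 5 = most bot-like).
# These values are invented for illustration; they are not real account data.
scores = [0.3, 0.8, 1.2, 1.9, 2.1, 2.4, 2.8, 3.1, 3.6, 4.2]


def bot_share(scores, threshold):
    """Return the fraction of accounts scoring at or above the threshold."""
    flagged = sum(1 for s in scores if s >= threshold)
    return flagged / len(scores)


# The same sample yields very different "bot percentages" as the cutoff moves.
for threshold in (2, 3, 4):
    share = bot_share(scores, threshold)
    print(f"threshold {threshold}: {share:.0%} flagged as bots")
```

On this invented sample, a cutoff of 2 flags 60 percent of accounts, a cutoff of 3 flags 30 percent, and a cutoff of 4 flags 10 percent, which is why a single headline figure is hard to interpret without the cutoff behind it.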
Botometer rated Musk a likely bot
The Botometer itself once “indicated that Elon Musk’s own Twitter account was likely a bot, scoring it 4/5,” as Twitter pointed out in a court filing. Musk’s Botometer score has reportedly fluctuated between 0.5 and 4, showing the tool rates Musk as human-like on some days and as more bot-like on others.
Twitter also pointed out that Musk and his team “have not indicated what score they are applying to conclude an account constitutes spam; thus, their allegation is unverifiable.” Twitter further noted that an account could be a bot without being what the company considers a fake account or spam. Twitter gave examples such as bots “that report earthquakes as they happen or updates on the weather.”
Other types of legitimate accounts can be seen as likely bots by the Botometer. The Botometer gave my own verified Twitter account a bot score of 3 out of 5 today, and it rated the verified Ars Technica account 3.6 out of 5.
The Botometer website’s FAQ cautions against labeling every account above a certain number a bot. “It’s tempting to set some arbitrary threshold score and consider everything above that number a bot and everything below a human, but we do not recommend this approach… We believe it is more informative to look at the distribution of scores over a sample of accounts,” the FAQ says.
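As a rough sketch of what looking at a distribution could mean in practice, the snippet below groups a sample of scores into one-point bins instead of splitting them at a single cutoff. The sample values, the bin width, and the `score_histogram` helper are all assumptions made for illustration; the FAQ does not prescribe a specific method.

```python
from collections import Counter

# Hypothetical sample of scores on the 0-to-5 scale (not real account data).
sample_scores = [0.3, 0.8, 1.2, 1.9, 2.1, 2.4, 2.8, 3.1, 3.6, 4.2]


def score_histogram(scores):
    """Count scores in five one-point bins: 0-1, 1-2, 2-3, 3-4, 4-5."""
    # A score of exactly 5 is folded into the top bin.
    counts = Counter(min(int(s), 4) for s in scores)
    return [counts.get(b, 0) for b in range(5)]


# Print a simple text histogram, one row per bin.
for low, count in enumerate(score_histogram(sample_scores)):
    print(f"{low}-{low + 1}: {'#' * count} ({count})")
```

A view like this shows how many accounts sit in the ambiguous middle of the scale, which a single human-or-bot cutoff would hide.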
Yang surprised Musk didn’t create a better tool
Yang also spoke to CNN recently, expressing surprise that Musk used the Botometer instead of creating something more precise. “To be honest, you know, Elon Musk is really rich, right? I had assumed he would spend money on hiring people to build some sophisticated tool or methods by himself,” Yang told CNN.
The Botometer is best used “to complement, not to replace, your own judgment,” the tool’s FAQ says, noting that “humans and machines have different strengths when it comes to pattern recognition. Some ‘obviously’ bot/human accounts according to a human observer will fool a machine-learning algorithm. For example, Botometer sometimes categorizes ‘organizational accounts’ as bot accounts. Likewise, an algorithm may confidently classify some accounts that humans have a hard time with.”
Twitter sued Musk in the Delaware Court of Chancery after he tried to get out of his commitment to buy the company for $44 billion. Musk has defended his attempt to break the merger agreement by questioning Twitter’s public disclosure that less than 5 percent of its monetizable daily active users (mDAU) are spam or fake.
Twitter defends the accuracy of its estimates, saying they are based on “multiple human reviews (in replicate) of thousands of randomly selected accounts each quarter using both public and private data.” Twitter also says Musk has no right to exit the merger agreement based on the number of spam accounts.
Musk has plans for a more thorough spam analysis, his court filing said. “Defendants’ experts are continuing their analysis even now and, in anticipation of production of additional data by Twitter (including ‘private’ data that Twitter makes available to its human reviewers and contends is necessary to verify its reported less-than-5-percent spam and false user rate), intend to conduct a more comprehensive analysis and expect to present updated estimates and findings in expert reports and at trial,” Musk’s lawyers wrote.