by Ole Pütz

Detecting Twitter Bots: a Test of Botometer with Bots of Different Complexity

How do differences in bot behavior affect detection by Botometer, the popular Twitter bot detection framework? Merle Reimann addressed this question in a Master’s thesis written at Bielefeld University.

Botometer assigns each analyzed account a score between 0 and 1 (displayed as 0 to 5 on the website), where 0 indicates a human and 1 a bot account. At the beginning of September, Botometer was updated with a new model intended to improve bot detection. It now computes the score from the probability that an account belongs to one of several bot classes (Astroturf, Fake follower, Financial, Self-declared, Spammer).
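To relate the two scales mentioned above, here is a minimal sketch that converts a raw score to the website's display range. It assumes the mapping is a simple linear scaling by a factor of 5, which the text does not state explicitly; the function name `to_display_score` is hypothetical and not part of Botometer.

```python
def to_display_score(raw_score: float) -> float:
    """Map a raw score in [0, 1] to the 0-5 range shown on the website.

    Assumption: the website scale is a linear rescaling of the raw score.
    """
    if not 0.0 <= raw_score <= 1.0:
        raise ValueError("raw score must lie between 0 and 1")
    return raw_score * 5.0


if __name__ == "__main__":
    # Illustrative values only, not results from the thesis.
    for raw in (0.0, 0.5, 1.0):
        print(raw, "->", to_display_score(raw))
```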

One of the experiments conducted for the thesis began when the new Botometer version was introduced. It used four bots that tweeted from templates and differed slightly in behavior. The first and third bots tweeted every 12 hours; the third additionally translated its tweets from English to German to French and back to English to vary the content slightly. The second and fourth bots tweeted at irregular intervals between 9 am and 8 pm; the fourth also applied the translation step.
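The two posting schedules described above can be sketched as follows. This is an illustration under stated assumptions, not the code used in the thesis: the function names, the number of tweets per day for the irregular bots, the random seed, and the start date are all hypothetical, since the text only gives the 12-hour interval and the 9 am to 8 pm window.

```python
import random
from datetime import datetime, timedelta


def fixed_schedule(start, days, interval_hours=12):
    """Tweet times at a fixed interval (every 12 hours by default)."""
    step = timedelta(hours=interval_hours)
    end = start + timedelta(days=days)
    times = []
    current = start
    while current < end:
        times.append(current)
        current += step
    return times


def irregular_schedule(first_day, days, tweets_per_day=2, seed=0):
    """Random tweet times between 9 am (inclusive) and 8 pm (exclusive).

    Assumption: tweets_per_day is a free parameter; the source does not
    say how often the irregular bots tweeted.
    """
    rng = random.Random(seed)
    window_start = 9 * 60   # 9 am, in minutes after midnight
    window_end = 20 * 60    # 8 pm, in minutes after midnight
    midnight = datetime(first_day.year, first_day.month, first_day.day)
    times = []
    for day in range(days):
        base = midnight + timedelta(days=day)
        minutes = sorted(
            rng.randrange(window_start, window_end)
            for _ in range(tweets_per_day)
        )
        times.extend(base + timedelta(minutes=m) for m in minutes)
    return times


if __name__ == "__main__":
    start = datetime(2000, 1, 1, 8, 0)  # arbitrary example date
    print(fixed_schedule(start, days=2))
    print(irregular_schedule(start, days=2))
```

The fixed schedule produces evenly spaced posting times, while the irregular one varies from day to day within the same daytime window.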

Over the course of the experiment, the second and third bots gained additional abilities. After following a number of users and unfollowing everyone who did not follow back, they started to retweet. Six days later, they also began to like tweets from other accounts. Both behaviors, retweeting and liking, lowered their bot scores considerably compared to the two bots whose behavior did not change. In other words, Botometer rated the bots with the additional abilities as far less likely to be bots than the others.

It seems that Botometer is good at detecting simple bots but struggles with bots that show more complex behavior and do not belong to one of the bot classes it uses.