Given the inherent interdisciplinarity of these issues, we believe that productive research must integrate competences from different scientific fields and provide a framework to facilitate cooperation.

To do so, our teams are assembled not according to disciplines but according to common methods (methods A to E). The three levels of analysis (I to III) involve the whole consortium. Tasks represent the intersection of methods and levels of analysis.

1. Methods

1.1. Method A: Data Analysis

Our technical innovations and experimental interventions will rely on a deeper theoretical understanding of social bots, online discussions, and echo chambers, which we will acquire through empirical analyses. Our plan is to integrate qualitative methods with network analysis and natural language processing. For example, the qualitative analysis of conflictual discussions will contribute to the development of our automated discussion monitoring tool, which will in turn inform the quantitative analysis of echo chambers. We will focus on the OSNs Twitter and Reddit. Twitter has become a major platform of political communication and primarily a medium for information diffusion, whereas the aggregation and discussion of news on Reddit more closely resembles the discussion forums on traditional newspaper websites. The period before elections is when we expect online political deliberation to increase and malicious accounts to be most active. Our past work collected data sets on the 2016 US and 2019 European elections, and we plan to collect data sets for the 2020 US and 2021 German federal elections. The vosonSML R package (Graham, Gertzel, Chan & Ackland 2019) will be used to collect the Twitter and Reddit data. Twitter and Reddit both support automatic data gathering and allow the use of bots, which are prerequisites for Methods C and D.

1.2. Method B: Psychological Experiments

Psychological experiments will help us to untangle the complex interaction of technical, psychological, and social factors, allowing us to focus on specific variables. For example, the analysis of conflict sequences will yield hypotheses that pinpoint the phases in a conflict where interventions are most likely to be successful. These hypotheses, once experimentally validated by varying the phase of conflict intervention, will support effective computational intervention strategies. Experiments thus provide the link between analysis (Method A) and the development of technical solutions (Method C).
Experiments will make use of our benevolent bots as well as a Wizard of Oz approach (Dahlbäck et al. 1993), which allows us to manually test functionality that is yet to be developed. Experiments will investigate interactions between bots and other users, for example to determine how the presumed identity of an account (i.e., bot or human) or the style of delivery affects message reception. Insights derived from these experiments will then be used to develop our bots.

1.3. Method C: Development of Metrics and Bots

Method C consists of the development of metrics for social media monitoring plus the development of the technical basis for benevolent bots that can intervene in online communication processes. Building on our work (Graham & Ackland 2017; Veale 2015, 2018a) and that of others (Brooker 2019; Massanari 2017), we plan to use bots creatively to expose the nature of manipulative social bots, support public deliberation processes, and build bridges between digital communities. Beginning with out-of-the-box features, we will increasingly take advantage of our own developments, drawing on our consortium's deep experience in developing working Twitter bots (Veale & Cook 2018).
Method C will be informed by results from Method B, particularly as they relate to the development of bots for Twitter and Reddit, while also feeding into Method B by integrating bots into experiments. Method C will also be informed by Method A, for example when developing the discussion quality metrics. To facilitate an interdisciplinary and participatory technical design process, we will use design thinking and agile development methods, so that all project participants as well as civil society stakeholders can contribute to our technical developments and assess their capacity to contribute to political deliberation.

1.4. Method D: Social Media Monitoring and Experimental Interventions

While Method C is concerned with the development of technologies, Method D is concerned with their testing and deployment. Method D relies on the results from all the other methods. We plan two types of contributions: (1) improving and building systems for monitoring social media; (2) conducting experimental field interventions with benevolent bots. Concerning (1), our goal is to improve the BotOMeter system (hosted by our collaborators at Indiana University), and to build tools that enable both discussion quality monitoring and echo chamber monitoring, to be hosted by CITEC at Bielefeld University and by the VOSON Lab at the Australian National University. Concerning (2), we will conduct experimental field interventions in social media with the goals of mitigating the negative impact of social bots, assisting in the moderation of controversial discussions, and disrupting echo chamber structures.

1.5. Method E: Practical Ethics in Dialogue with Civil Society

Methods C and D will also benefit from our collaboration with civil society stakeholders, with whom we will regularly organize workshops over the lifespan of the project (cf. appended Letters of Intent). The workshops will focus on participatory technology design (Schiffhauer et al. 2016; Schiffhauer 2018) and allow us to integrate stakeholder experiences of manually opposing antidemocratic tendencies in OSNs into the development of our automatic tools. Civil society stakeholders will in turn test prototypes of tools and provide valuable feedback. Our joint goal will be to develop tools that demonstrate real practical utility for the everyday work of these stakeholders.
With reference to the European Ethical Guidelines for trustworthy AI (HLEG 2019) as well as the ethical aspects of the AI-Strategy of the German Government (Bundesregierung 2018), we are devising ethical guidelines for our bots in collaboration with the ethics boards of our universities. As a direct consequence of what we consider to be issues with social bots and conflicts, our bots will always identify themselves as bots, and will never seek to deceive or intentionally harm human users. As we develop and test new capabilities of bots, their ethical implications will also be addressed and our ethical guidelines updated accordingly. Based on a better understanding of how both humans and technology jointly shape online political discourse, we will contribute in theory and practice to a key ethical question in contemporary societies: to what extent should technology be allowed to influence political deliberation?

2. Level of Analysis I: Bot Detection (Actor Level)

2.1. Task A.I: Identification of Communication Styles of Automated, Manual, and Hybrid Accounts:

Based on our preliminary work (Muhle et al. 2018, 2019; Muhle & Pütz 2019; Muhle 2020), we will develop and validate a typology of social media accounts that differentiates actor types based on their purposes, degrees of automation, and communication styles. This will provide us with a better understanding of different account types on OSNs and will help to improve automated approaches for bot detection (cf. D.I).

2.2. Task B.I: Experiments about Effects of Bot Identification and Bot Personalities:

Research shows that bot messages are taken seriously even if bots identify themselves (Edwards et al. 2014; Spence et al. 2019). To increase the probability that bot content is accepted by other users as legitimate, we will test experimentally how different bot “personalities” are perceived by other users, in order to select the most successful personalities for other tasks. We have already explored which accounts users judge to be bots when provided with information about them, and will continue to do so experimentally.

2.3. Task C.I: Development of a Basic Framework for 3B Bots:

This task will develop a computational framework for conversational bots that can intervene in ongoing discussions on both Twitter and Reddit. As a first milestone and proof of concept, it will deliver a framework for developing and deploying such bots, as well as a set of benevolent bots that can interact with other bots. We will perform early tests with our bots to (1) interact with and identify potentially malevolent social bots, and (2) determine whether we can engage third parties in a discussion carried out by two of our bots, as a basis for disrupting echo chambers (cf. C.III & D.III). In this way we will acquire preliminary data and provide the technical infrastructure for the work packages in Method C.
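The core of such a framework can be sketched as a platform-agnostic interface that separates the decision to intervene from the composition of a reply; the class and method names below are illustrative placeholders, not the project's actual API, and the trigger is a toy condition standing in for the discussion monitoring metrics:

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass

@dataclass
class Message:
    """Minimal platform-agnostic view of a post (Twitter or Reddit)."""
    author: str
    text: str
    thread_id: str

class BenevolentBot(ABC):
    """Subclasses decide when to intervene and what to say."""

    @abstractmethod
    def should_intervene(self, msg: Message) -> bool: ...

    @abstractmethod
    def compose_reply(self, msg: Message) -> str: ...

class SelfIdentifyingBot(BenevolentBot):
    """Always discloses its bot identity, per the project's ethical guidelines."""
    DISCLOSURE = "[I am an automated account.] "

    def should_intervene(self, msg: Message) -> bool:
        # Toy trigger: react to questions; a real trigger would come from
        # the discussion quality metrics (cf. C.II).
        return "?" in msg.text

    def compose_reply(self, msg: Message) -> str:
        return self.DISCLOSURE + f"@{msg.author}, here is some context on that topic."

def run_bot(bot: BenevolentBot, stream):
    """Feed a message stream to a bot; collect (thread_id, reply) pairs."""
    return [(m.thread_id, bot.compose_reply(m)) for m in stream if bot.should_intervene(m)]
```

A platform adapter (Twitter or Reddit API client) would supply the `stream` and post the collected replies, keeping bot logic independent of either platform.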

2.4. Task D.I: Improvement of BotOMeter Bot Detection:

BotOMeter is currently based on a supervised machine learning approach that leverages a large number of generic features. The typology of bot accounts and communication styles identified in Task A.I will allow us to improve BotOMeter's precision and recall. The goal is to design focused features that capture behaviors of humans that are relatively easy to measure automatically but difficult for bots to reproduce. Examples include the use of indexical or deictic references (e.g., Levinson 2004; Pütz 2019b) in tweets, the use of humor and irony (Buschmeier et al. 2014; Veale 2018b; Veale & Valitutti 2017a, 2017b), or the ability to engage in reciprocal exchanges with other users (cf. A.II). This analysis will result in the design of promising features as a basis to train classifiers for an improved version of BotOMeter that recognizes specific behaviors and classes of bots, thus avoiding the need for a single classifier to stretch itself to recognize all forms of automation.
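To illustrate the kind of focused features meant here, the sketch below computes two hypothetical per-account signals, a deictic-token rate and a reply-reciprocity score; the word list is a stand-in, not a trained BotOMeter feature set, and a real system would feed such values into a supervised classifier:

```python
# Illustrative feature extractors for behaviors that are easy to measure
# but hard for bots to fake. The deictic word list is a toy placeholder.
DEICTIC = {"here", "there", "now", "today", "yesterday", "this", "that"}

def deictic_rate(tweets):
    """Share of tokens that are indexical/deictic expressions."""
    tokens = [t for tw in tweets for t in tw.lower().split()]
    return sum(t in DEICTIC for t in tokens) / max(len(tokens), 1)

def reciprocity(replies_sent, replies_received):
    """Balance of back-and-forth exchange with other users (0..1)."""
    if not replies_sent and not replies_received:
        return 0.0
    return min(replies_sent, replies_received) / max(replies_sent, replies_received)

def account_features(tweets, replies_sent, replies_received):
    """Feature vector for one account, ready for a downstream classifier."""
    return [deictic_rate(tweets), reciprocity(replies_sent, replies_received)]
```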

3. Level of Analysis II: Interaction Analysis and Discussion Monitoring (Interaction Level)

3.1. Task A.II: Conflict Sequence Analysis and Argument Identification:

The definition of conflicts introduced in section 2.2 allows us to distinguish conflicts that are potentially beneficial or detrimental to political deliberation online. As a first step, we will extract discussions, understood as reciprocal communication between users, from our data sets (cf. Muhle et al. 2018, 2019), while also identifying the contributions of social bots within these discussions. Analysis will then proceed iteratively and alternate between qualitative and quantitative modes of analysis. Controversial discussions will be analyzed using qualitative sequential analysis, with a focus on types of conflicts and argumentative strategies (Pütz 2019a). To analyze patterns of conflict quantitatively, we will explore the usefulness of statistical methods such as relational event modeling (e.g. Butts & Marcum 2017) and social sequence analysis (e.g. Abbott 1995; Cornwell 2015).
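The extraction step can be sketched as follows: treat a discussion as a pair of users who have each replied to the other, and keep the ordered reply events for that pair, which is roughly the event-sequence input that relational event models expect. The function names and the pair-based notion of reciprocity are simplifying assumptions for illustration:

```python
from collections import defaultdict

def reciprocal_pairs(replies):
    """replies: list of (sender, receiver) reply events.
    Return the user pairs that replied to each other at least once --
    the minimal unit of a discussion as reciprocal communication."""
    seen = set(replies)
    return {frozenset((a, b)) for (a, b) in seen if (b, a) in seen}

def discussion_threads(replies):
    """Group reply events by reciprocal pair, preserving event order --
    a crude event sequence for downstream sequence analysis."""
    pairs = reciprocal_pairs(replies)
    threads = defaultdict(list)
    for i, (a, b) in enumerate(replies):
        key = frozenset((a, b))
        if key in pairs:
            threads[key].append((i, a, b))
    return dict(threads)
```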

3.2. Task B.II: Experimental Intervention in Controversial Discussions:

The causes of conflict and incivility in social networks are manifold; the literature identifies various influencing factors, such as participant personalities (Ziegele et al. 2013), platform design (Frieß et al. 2018), or the news value of topics (Muddiman et al. 2017). While these aspects cannot be influenced by us, practice and research show that manual moderation and argumentative, objective participation in discussions can have positive effects on the course of a conflict by raising the level of objectivity and politeness in a debate (Ley 2018; Ziegele et al. 2018). So far, the moderation of discussions can be automated only to a limited extent (cf. Kiel Long et al. 2017; Seering et al. 2018) and with limited topical scope (cf. Savage et al. 2016; Munger 2017). To test the potential for the technologically assisted objectification of controversial discussions, we will conduct experiments with different versions of a Wizard of Oz setup to simulate our own developments until they are ready for testing (cf. C.II).

3.3. Task C.II: Discussion Monitoring:

In this task we develop methods to monitor ongoing discussions on Reddit and Twitter. We will develop techniques to recognize phenomena such as disagreements, conflicts, and arguments (cf. A.II), which will allow us to provide metrics for the overall quality of discussions (cf. D.II). First, we will build on our experience in developing supervised methods to detect sentiment in social media and review data (Jebbara & Cimiano 2016, 2019; Klinger & Cimiano 2013) and extend these methods to monitor other relevant aspects beyond sentiment, including the use of insulting language, disagreement tokens, and stereotypes. Second, we will develop techniques to recognize arguments on Twitter as proposed by Dusmanu et al. (2017) and group them by similarity, building on our experience in developing methods for clustering social media posts by similarity (Reuter et al. 2012). In doing so, we will build on our theoretical considerations (cf. 2.2) and identify different types of conflicts as well as escalating conflicts in discussions. As we iteratively extend our methods for discussion monitoring, we will integrate them into early prototypes that can be evaluated by the consortium and civil society stakeholders. Furthermore, in preparation for Task C.III, we will test which (combinations of) methods are useful triggers for our bots to become active in ongoing discussions.

3.4. Task D.II: Tool for Monitoring Discussions and Identifying Arguments:

In this task we turn the metrics that measure the quality of discussions (cf. C.II) into a web application that can be used to monitor political discussions. For selected topics, the tool will be able to group arguments by similarity and commonness, enabling human users to intervene in discussions manually, e.g. by contributing dissimilar and uncommon arguments. The tool will be evaluated together with civil society stakeholders.

4. Level of Analysis III: Echo Chamber Disruption (Structural Level)

4.1. Task A.III: Analysis of Potential Echo Chambers and Identification of Active Bots:

Our project will apply network and text analysis to identify echo chambers, with an additional focus on the contribution of social bots to the formation of echo chambers (cf. D.I). We propose that a cluster of actors in social media is a necessary but not sufficient condition for the existence of an echo chamber, and sufficiency can be established via computational text analysis. Furthermore, information theory and text analysis can be used to establish whether there are statistically significant changes over time in information content on social media (Orlikowski et al. 2019). At a later project stage, the discussion quality metrics (cf. C.II) will also be used to analyze echo chambers. In addition to computational text analysis, we will use the methods developed in task A.II to consider how topics are distributed in echo chambers and whether they are discussed in a controversial fashion.
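One simple information-theoretic signal of the kind gestured at above is the Shannon entropy of a community's word distribution per time window: sustained low or falling entropy would indicate increasingly homogeneous content, consistent with an echo chamber. This sketch is an illustration, not the specific measure used in Orlikowski et al. (2019):

```python
import math
from collections import Counter

def token_entropy(texts):
    """Shannon entropy (bits) of the word distribution over a set of posts.
    Lower entropy means more homogeneous, echo-chamber-like content."""
    counts = Counter(t for txt in texts for t in txt.lower().split())
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def entropy_trend(windows):
    """Entropy per time window; a sustained drop flags narrowing content.
    windows: list of lists of posts, in chronological order."""
    return [token_entropy(w) for w in windows]
```

Statistical significance of a drop would then be assessed against a null model, e.g. by shuffling posts across windows.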

4.2. Task B.III: Adaptation of Manual Interventions and Automation Tests:

In echo chambers, users are confronted with highly selective and one-sided news and opinions (e.g. Barberà 2018; Dylko et al. 2017, 2018). To disrupt echo chambers, a promising avenue is to encourage members to reflect on their own opinions. The confrontation with opposing political views, however, does not necessarily lead to a softening of political attitudes, but can also have the opposite effect (Bail et al. 2018), a risk we suspect to be high for echo chambers. The first goal of this task is therefore to adapt available manual instruments in the fight against incivility and hate speech to the context of echo chambers, on the basis of our own work (cf. A.III) and expert interviews with civil society stakeholders. The second goal is to test experimentally which manual interventions can also be automated with bots and to assess the effects of this automation; features of our bots that are not yet available will be simulated using a Wizard of Oz approach (cf. C.III).

4.3. Task C.III: Development of Bridging Bots for Echo Chambers:

In this task we develop the technical methods to intervene in echo chambers with our bots. Hypotheses about likely intervention methods will be derived from results of tasks A.III and B.III. A promising method is to inject arguments observed outside the echo chamber into the echo chamber. This could be achieved by having our bots contribute to ongoing discussions or by simulating a political discussion between two of our bots that members of an echo chamber can observe. Content creation for this simulated discussion can be based on task C.II by quoting (retweeting) arguments from other discussions. Although we cannot yet anticipate what methods will be successful, we will also test a playful and humorous framing of our methods, which may be successful in some scenarios where purely objective interventions are not (Veale 2019).
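The argument-injection idea can be sketched as a selection step: among arguments observed outside the chamber, prefer those whose vocabulary barely overlaps with the chamber's own content, since these are the perspectives its members are least exposed to. The overlap measure and threshold below are illustrative assumptions, not the project's actual selection criterion:

```python
def bridging_candidates(inside_posts, outside_posts, overlap_threshold=0.2):
    """Pick outside arguments whose vocabulary overlaps least with the
    echo chamber's own content -- candidates for a bridging bot to quote."""
    inside_vocab = {t for p in inside_posts for t in p.lower().split()}
    picks = []
    for p in outside_posts:
        toks = set(p.lower().split())
        if toks and len(toks & inside_vocab) / len(toks) <= overlap_threshold:
            picks.append(p)
    return picks
```

In practice such candidates would be filtered further, e.g. by the discussion quality metrics (cf. C.II), before a bot quotes them.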

4.4. Task D.III: Deployment and Test of Bridging Bots in the Wild:

On the basis of the previous tasks, we aim to test our bots in existing echo chambers. The first goal is to identify social bots that are active in echo chambers (cf. D.I) and use our bots to mitigate their influence (cf. C.I). This may be achieved by providing information to users who are connected to social bots or by more complex strategies. The second goal is to intervene in political discussions in echo chambers in a controlled fashion. Based on our quality metrics (cf. C.II), we will identify communities with high-quality discussions and diverse arguments, which can be injected into echo chambers (cf. C.III). Network analysis (cf. A.III) as well as our discussion quality metrics (cf. C.II) will be used to evaluate whether our bots can impact echo chamber structures in the medium term. Based on results from A.III, we will also expand our discussion monitoring tool to include metrics about echo chambers, which will be made available for civil society stakeholders to monitor and intervene manually in online discussions.