How can research help us better understand online hatred and manage it effectively? Whose voices should we center in that work, and how might we need to think differently to create meaningful change?

I’m here at the Computer-Supported Cooperative Work conference in Minneapolis, where computer scientists and social scientists gather to discuss peer-reviewed research on technology and society. Today’s session is organized around the theme of content moderation, and I am honored to attend and liveblog the session.

“moderation is fundamentally about power.”

Sarah Gilbert


Speaking first is CAT Lab Research Director Dr. Sarah Gilbert. Sarah is presenting remarkable research that won the best paper award at the conference: Towards Intersectional Moderation: An Alternative Model of Moderation Built on Care and Power.

Sarah opens by observing that content moderation is fundamentally about power. A lot of moderation is top-down from tech platforms and governments. A lot of moderation work is also bottom-up, led by individuals and communities. Neither system is working very well. People are hurt when there’s not enough moderation, and people are also hurt when there’s too much moderation. And in general, moderation technologies and practices tend to reinforce current power structures.

People have called for alternative models of moderation — ideas that are proactive, community-driven, justice-based, and involve network ethics and transparency/accountability. The problem is that these ideas aren’t going to work if the people on the ground can’t implement them. And they often can’t, because there are messy power dynamics to navigate.

Sarah suggests that intersectional feminism is one model that could help us think more clearly about achieving a vision of moderation that works for us. To investigate this issue, Sarah carried out a collaborative ethnography, an approach where researchers work alongside people to create knowledge together. In this project, Sarah worked as a moderator with AskHistorians, a Q&A community on Reddit that answers people’s questions about history. Founded in 2011, the community is organized by a group of historians and tends to have an audience of young White men.

To keep a community conversation going, moderators create unique rules. Thinking about the community as cultivation, moderators imagine their work as “curating the sub”, akin to “pulling out weeds so flowers can grow.”

To help us understand how intersectional feminism could incorporate ideas about power and care into moderation, Sarah tells us a story about how the AskHistorians community engaged with responses to community questions. She shares the story of a post about the harmful colonial legacy of Christopher Columbus, written by two Indigenous moderators on Indigenous Peoples’ Day. Many commenters responded by defending Columbus, calling him a “giant of human history” and erroneously crediting Columbus and colonization with bringing freedom and opportunities to colonized peoples.

“marginalized moderators experience racism differently from dominant-culture moderators, something that can partly be addressed through interpersonal care”

Sarah Gilbert

When the community’s Indigenous historian moderators offered extensive, thoughtful responses to these comments, readers attempted to get the responses removed, arguing that they were impolite and against the rules. Other readers downvoted the Indigenous historians’ work. After 10 hours of responding, one of the Indigenous historians was running out of patience. Reflecting later on the incident, the moderator reported:

“I recall talking privately… about how deeply frustrated I was feeling. It was getting late, and it had been a long day of receiving nonstop pushback and even aggression… It was jarring seeing people ignoring what I said, nitpicking at my words without any contextualization, and refusing to engage with my responses in good faith. It was exhausting.”

The moderator reported that responding to subtle racism was especially exhausting compared to outright bigotry, since it left room for argument.

How can we make sense of this through intersectional feminism? Sarah starts by outlining how interpersonal power affects the experience. She describes how marginalized moderators experience racism differently from dominant-culture moderators, something that can be addressed by cultivating forms of interpersonal care. Power at the community level can also be an issue: when the majority in the community downvoted Indigenous voices, it was a breakdown in majoritarian moderation approaches. Finally, Sarah talks about power at the systemic level, describing how the systems of Reddit (and academic history as a field) create the conditions for widespread misunderstanding, prejudice, and harassment.

Next up are Julia Sasse and Jens Grossklags, speaking about their work on “Breaking the Silence: Investigating Which Types of Moderation Reduce Negative Effects of Sexist Social Media Content.”

Julia starts by reminding us why sexist social media content is a problem and why it needs to be addressed. She describes sexist content as prejudiced content about people based on their sex or gender, ranging from threats of violence and abusive language to nonconsensual images and doxing. Julia reminds us that sexist content is common globally, that it causes measurable psychological harm, and that it drives people out of important public conversations.

What can we do about this, and do those actions work? Julia describes two common responses to sexism online — deletion of content and counterspeech, where the perpetrator is publicly or privately reprimanded. Both of these interventions have the potential to change behavior by influencing norms: people’s beliefs about what others expect of them. CAT Lab has also published work on social norms and moderation, so I was excited to learn more about the study.

Based on theories of social norms, Julia and Jens expected that deletion would provide a powerful normative signal. They also expected that the effectiveness of counterspeech would depend on whether content was removed. If the content was kept up, they thought, counterspeech might be less effective, since the act of permitting sexist attacks might counteract any normative influence from what moderators say.

To test this hypothesis, the researchers set up a simulated Facebook environment where participants were shown a case of sexist attacks on a user. Participants were randomly assigned to one of a 2×3 grid of conditions that varied whether the sexist content was deleted and whether counterspeech was absent, public, or private. I was especially fascinated by the private counterspeech option, which noted: “(Moderation Team) The author of this message has been contacted.”
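As I understand the design, the six experimental conditions can be sketched as a simple factorial assignment. This is my own illustrative sketch; the level names are my shorthand, not the authors’ labels:

```python
import itertools
import random

# 2x3 factorial design: deletion (2 levels) x counterspeech (3 levels).
DELETION = ["kept", "deleted"]
COUNTERSPEECH = ["none", "public", "private"]

# Cross the two factors to get all six conditions.
CONDITIONS = list(itertools.product(DELETION, COUNTERSPEECH))

def assign_condition(rng: random.Random) -> tuple:
    """Assign a participant to one of the six conditions at random."""
    return rng.choice(CONDITIONS)
```

A real pre-registered study would typically use blocked or balanced assignment across the 825 participants rather than pure random choice; this only shows the shape of the design.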

To conduct this pre-registered experiment, Julia and Jens recruited 825 participants through Prolific: 412 women and 413 men. They then surveyed people about their perceptions of social norms and of the tolerance toward sexist behavior. What did they find?

Deletion did reduce participants’ perceived prevalence of harmful and hateful behavior — in other words, it made people believe such behavior was less common. Similarly, they found that deletion reduced the perceived tolerance toward sexist behavior. On top of that, deletion increased people’s feelings of being safe in the environment and their intent to participate in the group. When sexism remained visible, public counterspeech was effective in increasing feelings of safety and the intent to participate in the group.

In an exploratory analysis, Julia and Jens also looked at whether people considered moderation to be fair. While participants considered deletion to be effective, they didn’t consider it to be fair: people preferred to be able to judge for themselves whether the deletion decision was justified. Deleting content is the reaction that does the most to reduce harm and keep people safe, but it’s also the least trusted. Julia thinks there might be a trade-off between measures that make people feel safe and ones that feel fair and trustworthy to users. If so, that’s an important line of inquiry for the future.


Next up, Catherine Han presented a paper about hate raids on Twitch: “Echoes of the Past, New Modalities, and Implications for Platform Governance.” The paper was by Catherine Han, Joseph Seering, Deepak Kumar, Jeff Hancock, and Zakir Durumeric.

Catherine tells us about Twitch, a platform with 31 million average daily visitors and 7 million unique streamers. On Twitch, a person livestreams themselves and gives people a chance to watch and comment in a stream chat. Twitch streamers also apply categorical tags to their channel.

During the summer of 2021, Twitch streamers, especially marginalized streamers, faced extreme waves of harassment. Catherine tells us that this kind of coordinated attack has a longer history that goes back to the origins of online chat on systems like IRC. She explains that by studying Twitch, she and her collaborators hoped to better understand coordinated attacks on platforms more broadly.

To do this, the team collected chats and metadata from 9,664 popular channels over two weeks: 244 million messages in all, of which roughly 3,000 were part of hate raids, spanning 60 raids across 57 channels. Here I should note that collecting this data and aggregating a dataset of hate raids is HARD. Here’s a chart that outlines how the team created this fascinating dataset:

When analyzing the data, the researchers noticed that hate raids largely relied on automated tools, with single-purpose accounts that were primarily used for the raid. This ends up being a very important detail.
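The “single-purpose account” observation suggests a simple heuristic that detection tooling might use: flag accounts whose activity is overwhelmingly concentrated in a single channel. This is my illustrative sketch, not the researchers’ method, and the thresholds are invented:

```python
from collections import Counter

def looks_single_purpose(messages, min_messages=5, concentration=0.9):
    """Flag an account whose chat activity is overwhelmingly concentrated
    in one channel -- one rough signal of a raid-only account.

    `messages` is a list of (channel, timestamp) pairs for one account.
    The thresholds are illustrative defaults, not values from the paper.
    """
    if len(messages) < min_messages:
        return False  # too little activity to judge
    counts = Counter(channel for channel, _ in messages)
    _, top_count = counts.most_common(1)[0]
    return top_count / len(messages) >= concentration
```

A real classifier would combine signals like account age, follow patterns, and message similarity; concentration alone would misflag small accounts that genuinely follow one streamer.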

Many of these raids were antisemitic, anti-Black, and anti-trans. 98% of the messages were anti-Black, and 73% of them contained violent threats. Strangely, only 11% of these hate raids were sent to Black streamers. Many of these messages were basically “canned hate,” and they included techniques for detection evasion, such as inserting random noise and using homoglyphs (for example, replacing lowercase “l”s with capital “I”s) designed to evade simple moderation tools.
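Homoglyph substitution of this kind defeats naive keyword filters, which is presumably why the raiders used it. A common defense is to fold look-alike characters back to canonical ones before matching. A minimal sketch, assuming a hypothetical plain-text blocklist; production systems use much larger Unicode “confusables” tables:

```python
# Map common homoglyphs back to canonical characters before matching.
# This tiny table is illustrative only.
HOMOGLYPHS = str.maketrans({
    "I": "l",   # capital I standing in for lowercase l, as in the raids
    "1": "l",
    "0": "o",
    "а": "a",   # Cyrillic a -> Latin a
    "е": "e",   # Cyrillic e -> Latin e
})

def normalize(text: str) -> str:
    """Fold look-alike characters, then lowercase, so that obfuscated
    words match a plain-text blocklist."""
    return text.translate(HOMOGLYPHS).lower()

def matches_blocklist(text: str, blocklist: set) -> bool:
    normalized = normalize(text)
    return any(term in normalized for term in blocklist)
```

Note that the translation runs before lowercasing, so a capital “I” is mapped to “l” rather than being lowered to “i” first.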

The researchers also found that harassment was associated with tags on the Twitch site. In the period just before these hate raids, Twitch had created new categories for identity groups related to race, gender, and sexual orientation. In the research data, researchers found that 54 of the 57 targeted streams used at least one of these tags. In a statistical analysis, they found that Twitch’s attempt to make it easier to find marginalized communities on its platform had also made those communities easier to target for hate.

Next, the researchers looked at how streamers and moderation-bot developers experienced these hate raids. Streamers believed that people who were vocally and proudly trans or Black, or both, were most likely to be targeted, especially if they had a large audience. These streamers considered hate raids to be one part of a larger campaign of harassment. They described working to retain lawyers, coordinate resources for people to navigate hate raids, and develop moderation tools. Even though they knew these thrown-together tools had significant limitations, they appreciated having at least something to work with.

Overall, this study shows how platform design enables relatively naive automated harassment campaigns, says Catherine. The study also shows how communities were able to mobilize quickly for self-preservation in the shadow of failures and shortcomings from platforms.

Finally, Jie Cai presents a further paper on Twitch hate raids: Understanding Real-Time Human-Bot Coordinated Attacks in Live Streaming Communities. Jie also summarizes how Twitch works and the problem of hate raids on the platform. In theory, Twitch provides substantial resources for moderating harassment, from AI to platform staff and community moderation. Yet hate raids have persisted as a problem, and Jie and colleagues set out to find out why.

To study the problem of hate raids, Jie and his colleagues analyzed Reddit comments that discussed hate raids on Twitch. These Reddit users described four kinds of harm. First, hate raids pollute streams: Twitch sends a notification every time an account follows your account, so when large numbers of bots follow and participate in a stream, “it clogs up your followers’ alerts, which could last for minutes or hours.” Second, these bots can overwhelm a community with hateful comments, as Catherine described. Third, hate raiders sometimes tried to entrap users to get them banned, provoking them to respond to hate raids with frustrated or angry words that raiders could report to the platform to get the streamer banned. Fourth, hate raids wasted streamers’ time when they tried to respond to automated bots as if they were humans.

What can designers do to mitigate the effects of hate raids? Jie describes several options that Redditors discussed:

  • positive encouragement and support for streamers who face these raids
  • proactive tools to prevent attacks in the first place
  • reactive tools during an attack (perhaps a panic button that activates multiple automated defenses, or account bans)
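The “panic button” idea might look something like this: bundle several reactive defenses behind a single action. A minimal sketch; each defense here is a placeholder callback, since the real Twitch moderation API is not part of this post:

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class PanicButton:
    """Bundle several reactive defenses behind one action, as Redditors
    proposed: e.g. switching chat to followers-only mode, clearing
    recent messages, or banning accounts that joined during the raid.
    Each defense is a zero-argument callback; all names are hypothetical."""
    defenses: List[Callable[[], None]] = field(default_factory=list)
    activated: bool = False

    def add(self, defense: Callable[[], None]) -> None:
        self.defenses.append(defense)

    def press(self) -> int:
        """Run every registered defense once; return how many ran."""
        self.activated = True
        for defense in self.defenses:
            defense()
        return len(self.defenses)
```

The design point, as I read the discussion, is latency: during a raid a streamer has no time to toggle a dozen settings, so one trigger that fires them all is what matters.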

Overall, Jie observes that hate raids often focus on abusing platform features without violating moderation rules, creating an algorithmic confrontation between moderation tools and follow bots. In the meantime, marginalized streamers endure emotional, relational, financial, and physical harms. Jie calls on platform designers to prioritize protections from abuse when developing platform software and policies.