Introduction
Online communities have become essential social and civic infrastructure for our times. Increasingly, they serve as digital public squares—spaces where people exchange knowledge, form connections and support networks, debate norms and ideas, and organize collective action (Goldberg, Acosta-Navas, Bakker, et al. 2024).
At the center of these spaces are community moderators—volunteers who decide what content remains, what gets removed, and how conflicts are handled. As platforms expand and posting becomes easier through automation and new generative tools, the volume of material entering digital public squares has grown beyond what communities can easily govern.
There is growing pressure to automate moderation through AI tools that promise consistency, efficiency, and reduced human burden at a time when moderator burnout is a significant challenge (Gillespie 2020). The appeal is clear: automation scales beyond human capacity. But this promise comes with fundamental risks—that automation will undermine both trust in community moderation and the very human qualities—empathy, collaboration, understanding—that make online communities thrive.
Resolving this tension and fulfilling the promises of AI-enabled moderation requires an in-depth understanding of how moderation happens within healthy online communities and where support would be most valuable. While social media platform policies offer a first layer of defense against many issues, from child pornography to spam, online communities layer their own incredibly diverse rules and enforcement practices on top of what platforms provide.
We propose that the variance in how rules are created, implemented, and applied also requires reframing how community moderation is understood: not as a series of individual actions or simple rule enforcement, but as a form of collective intelligence, the shared capacity that emerges when people, norms, and technical systems interact to resolve problems no single actor could manage alone. In practice, moderation is not just a rule being applied to a post; it is a social process in which multiple actors interpret context, weigh consequences, and coordinate responses. Decisions emerge from the interplay between moderators, community members, platform policies, and technical systems (Gillespie 2018). In these ecosystems, moderators must negotiate shared norms, navigate edge cases, and adapt to shifting technologies. This collective intelligence cannot be fully grasped by looking only at individual decision-makers or automated systems in isolation.
This study thus sought to address three key questions: (1) How do online communities actually practice moderation in their day-to-day operations? (2) How do collaborative processes enable effective community moderation? (3) How might AI tools be designed to support rather than replace these human-centered practices?
To answer these questions, ReD Associates, a human sciences-driven research and strategy firm, and Jigsaw, an incubator within Google applying emerging tech to civic challenges, teamed up to conduct an ethnographic study of online moderation. This work builds on Jigsaw’s broader portfolio of efforts to strengthen digital public spheres, including developing free tools that support healthy conversations, support large-scale democratic deliberation, and help people find common ground in contentious public conversations (Goldberg, Acosta-Navas, Bakker, et al. 2024).
Our study on community moderation spanned 13 communities across Reddit, Discord, Slack, Mastodon, and BlueSky. These communities were selected to cover diverse types and topics, such as location-based groups, recreation and special-interest communities, and spaces for expression. We engaged with 39 participants in total—26 moderators and 13 community members—to triangulate our understanding of how moderation works from multiple perspectives.
In this case study, we highlight three key methodological choices we made:
- Taking on an ecological lens that framed moderation as collective intelligence rather than individual actions
- Creating ethnographic exercises that revealed tacit knowledge and judgment calls around concrete instances of moderation (e.g., walking through specific enforcement decisions, tracing moderator journeys over time, observing backstage mod chats as they unfold, and comparing official rules to lived practices)
- Conducting a two-day, in-person co-design workshop where moderators, product managers, and engineers collaborated to ground our insights and prototype new solutions
While not novel in and of themselves, these methodological choices yielded novel insights for researchers and practitioners striving for healthy digital public squares. First, healthy communities prioritize shared purpose over rigid rule-following, with moderators adapting rules as flexible tools. For example, a subreddit with a “no self-promotion” rule allowed members to share personal projects during a designated “Promo Friday,” and a UFO discussion group created a dedicated conflict channel where rules were intentionally loosened to let tensions surface without spilling into the rest of the community.
Second, moderation decisions depend on multiple layers of context beyond content analysis, and thus require tools that support situational reasoning. It was common for moderators to review a user’s bio, post history, and adjacent threads, and even to consult the wider internet, before deciding whether to remove a post or comment, revealing the contextual nature of moderation.
Third, our research introduced the original framework of “shielding” and “shepherding” as two distinct but complementary modes of moderation. Shielding refers to the removal of harmful content and bad actors, while shepherding describes the proactive, often relational work of coaching users, reinforcing norms, and shaping community culture. Moderators across platforms saw both as essential, but many described shepherding as the most meaningful part of their work—despite having limited time or support to do it well given the heavy demands of shielding members from harm.
These insights in turn drove clear product impact. Jigsaw’s product teams were able to translate these findings into a prototype for “customizable attributes”—a tool that lets moderators express their nuanced community guidelines in natural language and receive AI-generated scores on whether specific comments align (“Using LLMs to Support Online Communities” 2025). The prototype has since been tested with a major community platform, showing how research insights can be carried through into real-world moderation contexts.
This case study contributes practical approaches for studying collective intelligence in digital spaces, and shows how ethnographic methods can inform AI development that augments rather than replaces human expertise. The following sections detail the project context, research approach, key findings and their translation into product outcomes, and implications for future research.
Context and Problem Framing
This work arose from the product team at Jigsaw, an incubator within Google that explores emerging technologies and their impact on the world, translating research into tools and frameworks to inspire scalable solutions. As part of this mission, the Jigsaw team launched Perspective API in 2017, which uses content-based classifiers to detect attributes like toxicity, enabling platforms, publishers, and community managers to better host flourishing conversations. A common critique from users, however, was that these general-purpose classifiers often failed to capture the nuanced harms specific to their individual communities.
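To make that starting point concrete, the sketch below shows how a platform or community manager might request a general-purpose toxicity score from Perspective API. The request and response shape follow the public API documentation, but the key handling and error handling here are simplified placeholders, not production code.

```python
# Minimal sketch of a Perspective API toxicity request.
# Assumes an API key with access to the Comment Analyzer service.
import requests

API_KEY = "YOUR_API_KEY"  # placeholder
URL = (
    "https://commentanalyzer.googleapis.com/v1alpha1/"
    f"comments:analyze?key={API_KEY}"
)

def toxicity_score(comment_text: str) -> float:
    """Return the general-purpose TOXICITY score (0-1) for a comment."""
    payload = {
        "comment": {"text": comment_text},
        "requestedAttributes": {"TOXICITY": {}},
    }
    response = requests.post(URL, json=payload, timeout=10)
    response.raise_for_status()
    scores = response.json()["attributeScores"]
    return scores["TOXICITY"]["summaryScore"]["value"]

if __name__ == "__main__":
    print(toxicity_score("You are such an idiot."))
```

A single score like this is exactly the kind of general-purpose signal that, as users noted, often misses harms specific to an individual community.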
With recent developments in large language models, the team became interested in exploring how users could classify content more flexibly based on their own rules and needs. In addition, in 2024, dozens of subject-matter experts in online communities and the social sciences assessed AI’s role in the public square, and put forth opportunities including community-driven moderation with customizable classification tools (Goldberg, Acosta-Navas, Bakker, et al. 2024).
To better define this opportunity, the public “rules” from diverse online communities on platforms like Reddit and Discord provided a starting point to understand what kind of customization would be valuable. The team was also experimenting with prompts inspired by academic research to develop ways to flexibly classify different kinds of content from a community moderator’s perspective (Kumar, AbuHashem, and Durumeric 2024).
Yet, judging the quality of such classification remained challenging due to the lack of consistent, high-quality benchmarks for rule violations beyond specific domains and contexts, such as toxicity in online comments (cjadams, Sorensen, Elliott, et al. 2017). Given the wide variation in community rules and values, it was clear that a purely technical approach to studying community moderation was insufficient (Weld, Zhang, and Althoff 2024). Further, even 1:1 internal interviews with Perspective API users proved insufficient for understanding “quality” as it existed within a community ecosystem, not just for one community manager or moderator, but for the community’s members as a whole.
These gaps pointed to the need for more in-depth ethnographic research to understand the evolving moderation workflows, values, and metrics that shape community moderation. In addition, we sought to understand, from the communities’ own perspective, how effectively AI could classify content based on their specific rules. These questions drove our multi-pronged research strategy.
Study Design and Approach
An Ecological Lens on Moderation
Understanding moderation as a form of collective intelligence required a research approach that moved beyond individuals and into systems. Traditional user research methods like surveys and interviews can be valuable for uncovering individual pain points and preferences, or identifying large-scale attitudinal patterns; but in the context of our research these methods risked flattening the complex, relational, and often unspoken practices that shape how communities actually govern themselves.
To uncover how decisions are made, how norms are sustained, and how tools mediate power, we adopted an ecological ethnographic approach. This approach was designed to accomplish two goals. First, it shifted the unit of analysis away from the individual and toward the collective—foregrounding the systems of relationships and distributed responsibilities that define moderation in practice. Second, it emphasized observational and behavioral insight over reported perceptions, since our interest was not just in what people said, but in what they did—how moderation actually played out in situ, and how AI tools might be used, reinterpreted, or resisted in practice.
This allowed us to study moderation not as a discrete action, but as a negotiated and systemic process shaped by the interplay between moderators, community members, and technical systems. The triangulation of multiple perspectives was essential—not simply to understand what individual moderators thought or did, but to surface the distributed nature of community moderation itself. Rules are not only written; they are interpreted, enforced, ignored, contested, and evolved. These dynamics involve power, hierarchy, and conflict. Understanding them required engaging not only those who create rules, but also those who enforce them and those who experience their effects. This was not just an anthropological exercise—it was essential for designing AI tools that support not just moderators, but entire communities.
Concretely, this meant that our study design was anchored in long-form semi-structured interviews with multiple people within the same community who occupied different roles, held divergent perspectives, and engaged with moderation at different levels of proximity.
We embedded ourselves across 13 online communities of various sizes and thematic focus spanning Reddit, Discord, Slack, Mastodon, and BlueSky. In total, we engaged 39 participants, including community members and moderators, spending 2-4 hours with each of them. We deliberately sought participant diversity across platforms, community sizes, thematic focus, and community roles.
Recruiting participants was a challenge, not only because community moderators are relatively few in number, but also because moderators were wary of speaking with us. Moderators often operate in high-visibility, high-stakes environments with little institutional support (Li, Hecht, and Chancellor 2022). Many expressed discomfort about being linked to AI research and concern about how their communities might be represented. In response, we approached recruitment not as a logistical task, but as a core part of study design. Gaining access required embedding ourselves within communities in sustained ways: joining chats, following group norms, and earning legitimacy through transparency and a shared purpose of improving community moderation. This work was time-intensive by commercial standards, but methodologically justified by the need to understand communities as ecosystems.
This ecological lens allowed us to see the interplay between front-stage decisions (i.e., what moderators enforce and why) and the backstage infrastructure and conversations that support them (e.g., content removal tools, moderator training guides, moderator-only discussion channels) (Star 1999). It made visible the disagreements, edge cases, and negotiations that typically go unseen.
These tensions were not noise in the system; they were central to how moderators governed their communities collectively. In our Findings section we show in detail how this approach allowed us to understand how shared decisions were negotiated in real time, how rules evolved through informal conversation, and how moderation tools were customized or circumvented to meet community-specific needs.
Making Tacit Knowledge Visible
Even with long-form interviews and field immersion, much of what mattered most was not immediately accessible. Moderation is full of tacit knowledge—judgment calls, community histories, shared codes, and technical improvisations (Dourish 2004). This presented a central methodological challenge: how to surface what moderators did and knew, but could not always explain.
To address this, we designed a set of ethnographic exercises that made invisible practices observable and discussable (Suchman 1987; Polanyi 2009). These exercises, conducted in one-on-one in-depth interviews with moderators and members of each community, emphasized concreteness by asking participants to walk us through real cases, trace their own development as moderators, or reflect on specific moments of collective decision-making.
These exercises offered a flexible methodological toolkit for studying systems that are informal, dynamic, and often undocumented. They helped move the research beyond claims or opinions and into the everyday logic that underpins collective decision-making. In doing so, they laid the groundwork for understanding how AI tools might support, not replace, the existing intelligence embedded in community practices.
Co-design as a Path to Imagination
Another challenge surfaced when we asked participants to imagine the future of AI in moderation. When prompted directly, most moderators described incremental improvements to what they already had: faster filters, more accurate removals, improved spam detection. These suggestions were anchored in the limits of what they had seen, and often tinged with skepticism or resignation. Many were actively wary of automation. Others simply felt too burned out to engage in speculative thinking, given the very practical demands of the moderator’s job.
Moderators often viewed AI with discomfort, seeing it as a risk to their communities and their roles, and viewed researchers (especially those connected to tech companies, like us) as intermediaries of those risks. This demanded that we intentionally allow space for critique, ambivalence, and uncertainty, while offering structured ways to shift from reaction to creation.
To generate actionable ideas in this context, we adopted a co-design approach (Zamenopoulos and Alexiou 2018). We organized a two-day in-person workshop that brought together nine moderators with Jigsaw’s product and engineering teams.
The first day of the workshop functioned as a panel for moderators to connect with each other, express shared concerns, and build a perspective on what they hoped and feared, both about AI and, more generally, about the job of the moderator.
Then, on day two, we prioritized co-design. We leaned on design exercises, specifically rapid ideation and the real-time testing of a speculative AI moderation prototype, involving research participants directly in the design process.
Rather than asking “what do you want AI to do?” the co-design approach allowed us to ask, “what’s hard about this work, and who or what helps you do it well?” This framing afforded a more expansive and grounded set of insights. Together, moderators, designers, and engineers explored the potential role of AI as a tool for onboarding, team coordination, documentation, and decision-context surfacing—areas where moderators lack support and where AI is poised to assist.
For instance, one moderator from a local community in an area prone to natural disasters wondered whether AI moderation could serve as a fallback system during blackouts, showing both the importance and diversity of potential applications for AI in moderation:
“If it is overzealous that would be fine during a hurricane, [because during that time] you lost your entire mod team because we lost access to our community.” —Community moderator
These concepts went beyond the dominant assumption that the primary use of AI in community moderation would be in traditional content removal. By creating space for both critical reflection and concrete design, the workshop steered product development away from abstract utopian/dystopian framings and toward grounded interventions aligned with how moderators actually work.
Insights aside, one valuable outcome of this approach was that participants continued their collaboration post-workshop, forming an informal Discord and WhatsApp network to exchange tools, practices, and mutual support across platforms.
The co-design workshop was an extension of the ethnography itself. Moderators brought in real examples from their communities, tested in-progress prototypes, and generated design principles that reflected their values and needs. It was a space where product teams and moderators could meet as peers in design.
Findings
We do not present a comprehensive list of findings here. Instead, we highlight three grounded insights that demonstrate how an ethnographic approach can reveal the collective, tacit, and context-dependent nature of online moderation. Each insight draws a direct line from our methodological choices—an ecological lens, concrete exercises, and co-design workshops—to the kinds of knowledge they made visible.
Each insight opens with an analytical frame, then walks through a specific case that surfaced that pattern in practice. These accounts illustrate how community moderation is enacted not through rigid rule-following but through dynamic, situated, and collective reasoning. Finally, each case lands on product impact, showing how ethnographic methods can generate not just findings, but frameworks for rethinking the design of AI-assisted tools.
We begin with the problem of alignment: how moderators prioritize shared purpose over strict rule-following, adapting enforcement to fit the intended function of their communities. We then turn to context: how moderation decisions depend on layers of meaning—thread dynamics, user history, tone, and broader platform trends—that extend far beyond the content itself. Finally, we examine the distinction between shielding and shepherding: how moderation involves not only the removal of harmful content, but also the active work of coaching users, reinforcing norms, and supporting positive participation—work that moderators and members saw as central, yet largely unsupported by existing tools (Lo 2018).
These are not “best practices” for moderators. Rather, they are grounded descriptions of the craft of moderation (as well as the craft of ethnography) as forms of collective intelligence.
Insight #1
At the start of our research, we asked a deceptively simple question: What makes an online community healthy? If we could understand how moderators define and sustain health, we could help design tools that support those “emic” definitions (Pike 1954). But very quickly, as we scoped the research, it became clear that asking people directly wasn’t enough.
Moderators would often give clear, confident answers about what health meant in their communities. They cited civility, topical focus, and an unwavering respect for rules. But in fieldwork, a more complicated picture emerged. As anthropologists and STS scholars have long noted, what people say they do often diverges from what they actually do in context, with action shaped by tacit knowledge, improvisation, and situational demands (Suchman 1987; Polanyi 2009; Bourdieu 2020).
In one Discord community dedicated to discussing unidentified flying objects (UFOs), moderators were explicit about their expectations: no personal attacks, no spamming messages, no religion, and no politics. These rules were seen as necessary to maintain focus and avoid derailment. But when we joined the server and embedded ourselves in day-to-day interactions, we noticed something that didn’t match those stated boundaries. The community had a dedicated “conflict-resolution” channel—an active space where members were allowed to argue, air grievances, and clash directly. The rules weren’t absent there. But they were suspended, softened, made porous by context.
By “objective” standards, the conversations in the “conflict-resolution” channel violated several community rules: people spammed each other with messages and called each other names, often for hours. While the rules against spamming and personal attacks were largely lifted there, implicit rules remained in effect: racist slurs, stereotypes, and bullying were still diligently moderated out.
As we spent more time in this and other communities, we saw the same pattern repeated. Every one of the 13 communities we studied had what we came to describe as “release valves.” These were designated times or spaces where the written rules were relaxed. A subreddit with a ban on self-promotion had “Promo Fridays.” A community with strict guidelines around relevance had off-topic flairs. One moderator put it plainly:
“The flairs give people some room and the venting spaces give people a way to air their grievances, without the space getting out of hand.” —Community moderator
These weren’t lapses in enforcement. They were signs of a deeper logic: that rules were tools, not doctrine. Moderators treated them as flexible heuristics, applying them differently depending on who was involved, what the context was, and how the enforcement would impact the group’s function.
Over time, a higher-order insight became clear: a healthy community is not one where rules are rigidly followed, but one where purpose is coherently shared.
Purpose guides when and how rules are enforced or bent. While some moderators talked about “health” in terms of metrics like engagement or civility, those rules and metrics were in service of enabling a community that had a shared sense of purpose. Was the community offering the support it promised? Enabling debate? Helping people feel seen? Health in this view was more than a static condition—it was a dynamic negotiation between members and moderators around the function of the community.
This insight emerged not from what people claimed, but from close observation, contradiction, and immersion with both community moderators and members—made possible by our ecological approach and reinforced through exercises like community norm excavations and moderator journey mapping. It was visible only when we looked beyond the surface of stated policy into the lived practices and backstage improvisations that keep communities running.
While the insight was deeply contextual and ethnographic, the product implications were concrete. First, moderation tools needed to support rule flexibility, with features for designated exceptions, temporary overrides, or contextual toggles that mirror what moderators were already doing informally. Second, we identified a gap in the existing moderator toolkit: moderators needed tools that helped them articulate what their community was for, not just tools that helped them remove rule-breaking content.
Insight #2
Moderation tools today—especially automated ones—are built on a clear premise: that rule enforcement can be standardized (Gillespie 2020). Most products assume that if content can be parsed accurately, it can be judged automatically. Tools like automod scan for specific terms or patterns and, depending on configuration, remove or flag posts for review. From a product perspective, this makes sense: consistency is scalable, and automation reduces moderator burden.
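To illustrate this premise, a minimal sketch of keyword- and pattern-based automoderation logic follows. The rule format, names, and actions here are hypothetical, not any platform’s actual configuration; the point is only that such tools act on the text alone.

```python
# Hypothetical sketch of keyword/pattern automoderation: content either
# matches a configured rule or it does not; no surrounding context is consulted.
import re
from dataclasses import dataclass

@dataclass
class AutoModRule:
    name: str
    pattern: re.Pattern  # regex applied to the raw post text
    action: str          # "remove" or "flag_for_review"

RULES = [
    AutoModRule("no-solicitation", re.compile(r"\b(buy now|promo code)\b", re.I), "remove"),
    AutoModRule("possible-insult", re.compile(r"\b(idiot|moron)\b", re.I), "flag_for_review"),
]

def apply_rules(post_text: str) -> list[tuple[str, str]]:
    """Return (rule name, action) for every rule the post matches."""
    return [(r.name, r.action) for r in RULES if r.pattern.search(post_text)]

# A post that matches the letter of a rule is treated the same regardless of
# who wrote it, where it appears, or what it is likely to trigger next.
print(apply_rules("Use promo code MOD10 at checkout!"))  # [('no-solicitation', 'remove')]
```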
But inferring moderator needs solely from these tools would miss something essential. The deeper reality that emerged from our research is that moderators do not operate in isolation from their tools. Instead, human judgment and automation function together as a distributed system of decision-making—a form of collective intelligence. Where tools operate on rules, moderators operate on meaning. That gap between syntax and sense is where moderation becomes both difficult and indispensable.
Across field sites, we saw that community moderation teams rarely assessed a post in isolation. Their decisions were shaped not just by the content of the post, but by its context—specifically, the intent behind the comment and the impact it might have on the community. These two lenses—intent and impact—were used again and again to guide enforcement, but they were never accessible through surface-level content alone.
In one popular subreddit, a long-time user posted the phrase “snitches get stitches” in a heated thread. The phrase, standing alone, didn’t clearly violate platform policy against apology for violence. But a senior moderator immediately flagged it—not because of the words themselves, but because of what they would likely trigger. From experience, they knew the phrase had been a spark for escalation in the past. While it was clear the user did not intend to incite violence, the impact of the comment would likely be negative, so it was removed quietly and no warning was issued. The decision was not about the phrase itself—it was about what the phrase might do next, given the thread, the user, and the moment.
In a different Reddit community focused on gender and identity, a user replied to a post about queer experiences with a long personal narrative that, at some point, contained the phrase “I’m a real woman.” The phrase was ambiguous. On its face, it could be read as an affirming identity claim—or as an exclusionary jab at trans women. The moderator checked the user’s profile, their post history, and the communities they frequented; indeed, the user was active in anti-trans spaces. Given this context, moderators interpreted the comment as likely transphobic and intended to provoke harm. The comment was removed and the user was banned under the community’s “no discrimination” rule.
These weren’t edge cases, but typical moderation flows. Moderators regularly consulted multiple layers of context to assess the intent and impact of content in their communities:
- the content itself—what was said, and how;
- the thread—what came (or could come) before and after;
- the user—their role, history, and intentions;
- the community—its norms, sensitivities, and political terrain;
- the moment—platform-wide dynamics that might raise the stakes.
Our ecological approach made these practices visible. In interviews, through exercises like intervention play-by-plays and backchannel shadowing, we saw how decisions actually unfolded, following moderators as they performed these duties.
These ethnographic insights carried directly into the co-creation workshop. During the rapid ideation exercise, several moderators independently designed mockups of a “user summary” tool. Their prototypes included data points like a user’s join date, recent posts, prior moderation actions, and affiliated communities—indicators that they already relied on informally, but that current tools failed to surface in one place.
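As an illustration only (not the moderators’ actual mockups), the sketch below shows the kind of consolidated record those designs implied, using hypothetical field names for the data points moderators said they already gather by hand.

```python
# Hypothetical sketch of a "user summary": the context moderators already
# consult informally, collected in one place alongside a reported post.
from dataclasses import dataclass, field
from datetime import date

@dataclass
class UserSummary:
    username: str
    join_date: date                                                   # how long they have been a member
    recent_posts: list[str] = field(default_factory=list)             # latest activity in this community
    prior_mod_actions: list[str] = field(default_factory=list)        # warnings, removals, temporary bans
    affiliated_communities: list[str] = field(default_factory=list)   # other spaces they are active in

def render_summary(user: UserSummary) -> str:
    """Format the summary for display next to a flagged post."""
    return (
        f"{user.username} (joined {user.join_date:%Y-%m-%d})\n"
        f"  Recent posts: {len(user.recent_posts)}\n"
        f"  Prior moderation actions: {', '.join(user.prior_mod_actions) or 'none'}\n"
        f"  Also active in: {', '.join(user.affiliated_communities) or 'none'}"
    )

example = UserSummary(
    username="example_user",
    join_date=date(2023, 5, 1),
    recent_posts=["..."],
    prior_mod_actions=["warning: rule 3"],
    affiliated_communities=["r/example"],
)
print(render_summary(example))
```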
Later, in the prototype testing portion of the co-design workshop, participants interacted with an early AI “customizable attributes” prototype developed by the Jigsaw team. The tool offered two fields: one for past posts or comments, and one for community rules. Moderators could iteratively adapt the rule inputs to see how the system responded. Though still rudimentary, the prototype reframed AI not as an arbiter, but as a partner in judgment—one that needed clear boundaries and human calibration.
Moderators responded not only to the accuracy of the tool but to the control it offered. The ability to define rules and test their application in context reduced anxiety about automation. It grounded the AI conversation in a tangible workflow, shifting fears of replacement toward pragmatic questions of collaboration. Moderation, in practice, was a negotiation: between rules and people, tools and human judgment.
This insight had direct implications for product. Moderators weren’t asking for tools that removed content automatically. They were asking for tools that helped them make sense of content more quickly. Specifically, they needed systems that could:
- surface relevant context (user history, prior warnings, past thread activity),
- summarize prior decisions in similar cases,
- flag patterns that moderators themselves had learned to notice.
In this vision, far from replacing human judgment, AI tools can help scaffold it. Moderators wanted tools that could assist with contextual, situational reasoning—a way to apply shared values across different cases.
Insight #3
Building on the previous insight, we observed that once moderators assess a post’s intent and likely impact, they make a second, critical judgment: should I shield the community, or should I shepherd it?
This distinction—shielding vs. shepherding—emerged organically in fieldwork and became a generative frame for understanding the scope of moderation. Shielding refers to protecting a community from harm: removing malicious content, banning bad actors, muting inflammatory threads. It’s the dominant model embedded in most platform tooling (Lo 2018). Shepherding, by contrast, involves guiding well-intentioned users toward constructive participation, reinforcing shared norms, and building alignment over time. It is just as central to the role—yet rarely supported by existing tools.
Moderators didn’t see shielding and shepherding as oppositional. They described them as complementary modes of action, often deployed in combination. Their decisions hinged on perceived intent: if someone was acting in bad faith, they shielded. But if someone seemed sincere—just misaligned or unaware—they took the time to shepherd.
In one small and intimate Discord server, we observed a moderator during a live session using our intervention play-by-play method. When a long-standing user posted a petition—violating the group’s no-solicitation rule—the moderator quietly removed the content, then added a comment explaining the rule and why it mattered. The goal of moderation, he explained, wasn’t to shame users but to remind everyone of expectations. Later in the same session, a conflict flared between two other regulars. Rather than escalate, the moderator temporarily muted both and messaged them privately to understand the situation and coach them on how to behave better. As he put it, “The point isn’t just to stop fights. It’s to remind people what kind of space this is. If they keep breaking the rules, then that’s a different story.”
While shepherding showed up across many communities and situations, it became particularly visible during moments of change—such as shifts in membership or community tone.
One Reddit community we studied had grown rapidly and now counted millions of members. The existing norms, once intuitive to a smaller group, became increasingly misaligned with the expectations and behaviors of newer users. Moderators found that many violations weren’t malicious—they stemmed from confusion about what the space was for. In response, they implemented structural changes: temporarily limiting who could post, adding pinned content to articulate norms, and creating onboarding pathways to help new users understand the community before contributing.
“We’ve gone from 15M to 20M in just the past month…People are commenting without being subscribed – they haven’t read the rules…We added a flair for posts with a lot of rule-violating comments that basically only lets people who have been active for a while to comment.”—Moderator
This form of adaptive, longitudinal shepherding demanded more than reactive enforcement. It required moderators to interpret change, anticipate needs, and evolve their practices in real time.
Despite the importance of shepherding, moderators lacked tools that helped them with those parts of the job. During the co-design workshop, moderators generated many potential features and products that could support shepherding practices:
- Pre-written, editable messages for rule explanations
- Summaries of user history to support informed discretion
- Dashboards that track shifts in tone, membership, or engagement
- Content warnings to reduce emotional burden in reviewing flagged material
- Infrastructure to ease transitions during periods of rapid growth
Crucially, what this showed was that moderators did not want shepherding to be automated. They valued it as the most human and often most rewarding part of the role. What they needed was support: ways to carry out this work with more consistency, less friction, and lower cognitive overhead.
This insight reframed how our product team thought about tool design. While moderation is often cast as a defensive task focused on containment, moderators themselves saw it as generative and relational. Shielding tools should protect the integrity of the space, but there is room for more shepherding tools that help nurture the behaviors and norms that allow communities to thrive.
Discussion
Our findings suggest that ethnographic approaches can deepen how community moderation is studied and how AI support tools are designed. One contribution lies in making tacit judgment visible. Exercises such as community norm excavation revealed the distance between written rules and actual enforcement. When moderators reflected on specific decisions in light of their own posted guidelines, they often encountered dissonance between stated intentions and lived practice. This echoes Suchman’s (1987) argument about plans and situated actions: formal policies only ever partially account for what people do in practice. For AI systems, this raises a practical and ethical question: should models be calibrated to published rules, or to the norms as moderators actually enact them? Tools trained on guidelines alone will inevitably diverge from human judgment for reasons invisible to the model.
A second contribution comes from co-design, which strengthened both analysis and impact. Co-design gives agency to end users to steer new ways of working and closes the problematic gap between those building products and those using them (Bødker and Kyng 2018). By bringing engineers, product managers, and designers into the room alongside moderators and researchers, the needs, perspectives, and challenges of end users became directly visible to those responsible for development. This proximity enabled tighter, iterative product cycles and clarified where AI tools supported or strained moderation practice. One critical observation from this approach was that moderators’ initial skepticism of AI diminished through hands-on use and customization to their needs, suggesting trust in ‘black box’ AI tools can be built through use. The workshop also highlighted when moderators were most interested in integrating new tools into their repertoire: not as replacements for functioning practices, but as aids for onboarding new community moderators, for handling surges of content (e.g., during election season), and for “emergency moderation” when manual processes could no longer keep up (e.g., during natural disaster blackouts).
Finally, taking an ecological lens helped concretize our reframing of moderation as a distributed form of collective intelligence. By tracing how rules, tools, and relationships interact, we could see moderation not as individual judgment calls but as collective sense-making that emerges when people, norms, and technical systems coordinate to resolve problems no single actor could manage alone. This perspective made visible not only the contextual nature of moderation but also the power dynamics that shape interpretation and enforcement: We observed cases where two moderators, given the same evidence, reached different conclusions depending on their standing in the community. Members and moderators also diverged in how they understood a community’s purpose and norms. Such findings underscore why AI cannot be understood—or designed—simply as an autonomous decision-maker, but as one element within distributed systems of judgment. The question thus becomes not whether AI can moderate communities, but how it might support the collective intelligence that already exists while accounting for the social and political dimensions of community moderation.
Our research arrives as the field of content moderation is already moving toward more AI-based and context-aware tooling, though substantial room for innovation and moderator support remains. Platforms have extended moderation infrastructures from keyword filters to systems like Reddit’s AutoModerator and Discord’s AutoMod, while third-party services such as Bodyguard.ai, Hive Moderation, and Webpurify promise interoperable solutions across multiple platforms.
Ultimately, a product was developed based on this research: Customizable Attributes. Customizable Attributes enables users to provide their community’s guidelines in natural language and receive a score that reflects whether any given comment is consistent with them. This capability, built with the latest Gemini models, empowers any user, from mods to other community members, to highlight the comments they care about, which they can then label or analyze according to their own needs (“Using LLMs to Support Online Communities” 2025).
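For illustration only, the sketch below shows one way a guideline-based scoring call might be structured. The prompt wording, the `score_against_guidelines` helper, and the `call_llm` placeholder are hypothetical and do not reflect the actual Customizable Attributes implementation or API; they simply make the inputs and output described above concrete.

```python
# Hypothetical sketch of guideline-based scoring: community rules written in
# natural language plus a single comment go to an LLM, which returns a 0-1
# score. `call_llm` is a placeholder for whatever model client a platform
# uses; it is NOT the Customizable Attributes API.
def call_llm(prompt: str) -> str:
    raise NotImplementedError("Plug in a model client of your choice here.")

def score_against_guidelines(guidelines: str, comment: str) -> float:
    prompt = (
        "Community guidelines:\n"
        f"{guidelines}\n\n"
        "Comment:\n"
        f"{comment}\n\n"
        "On a scale from 0.0 (clearly consistent with the guidelines) to "
        "1.0 (clearly violates them), reply with a single number."
    )
    reply = call_llm(prompt)
    return max(0.0, min(1.0, float(reply.strip())))

# Example usage with a community-specific rule (echoing the "Promo Friday"
# exception observed in fieldwork):
guidelines = "No self-promotion, except in the weekly Promo Friday thread."
comment = "Check out my new app, link in bio!"
# score = score_against_guidelines(guidelines, comment)
```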
The Customizable Attributes product was deployed and tested publicly with a major community-based social platform, demonstrating that there is demand for this kind of ecologically adaptive tooling across a range of federated communities and online forums. Early tests demonstrated both promise and limits: the potential for alignment with community-specific rules, but also concerns about misuse that informed the decision to deploy it with professional trust-and-safety teams rather than volunteer community moderators. Parallel efforts such as the Agile Community Rules Classification competition (Sorensen et al. 2025)—which challenges participants to train models on a dataset of real subreddit rules and moderated comments, providing a starting point for common types of community rules that communities can then further customize or refine—reflect growing recognition that benchmark datasets must capture the diversity of community rules, not only generic categories like toxicity.
Finally, our study highlights where current innovation in community moderation is uneven. Investment has flowed primarily into shielding tools—systems that remove harmful content or actors—while moderators consistently emphasized the importance of shepherding: coaching, onboarding, and sustaining culture (Lo 2018; Seering 2020). These practices were not marginal but central to how communities understood health, yet they remain largely unsupported by existing tools (including Customizable Attributes). The challenge ahead is to distinguish which aspects of shepherding can be augmented—pattern detection across conversations, retrieval of precedents, drafting explainers—and which must remain human, such as empathy, negotiation, and purpose-setting. In this sense, our findings complicate the prevailing narrative of scale and efficiency (Gillespie 2020), pointing instead toward questions of alignment, meaning, and care.
Conclusion
Moderation is not individual action, but rather collective intelligence—emerging from dynamic relationships between moderators, community members, and technical systems as they negotiate norms and adapt to challenges. By studying moderation through this lens, we were able to challenge prevailing approaches to AI-assisted moderation, which assume rule enforcement can be standardized and automated. Instead, AI’s greatest potential lies in augmenting collective human judgment—providing tools that enable community-specific customization, surface context to support situational reasoning, and support moderators’ “shepherding” efforts.
The translation of these findings into Jigsaw’s “customizable attributes” prototype demonstrates how ethnographic understanding can inform practical product development, positioning AI as a partner in judgment rather than autonomous decision-maker. Yet this work also reveals opportunities for future research, particularly in developing tools that support the “shepherding” work many moderators find most rewarding.
The stakes extend far beyond individual platforms. By understanding and supporting the collective intelligence that sustains healthy communities, we can ensure that AI serves not just efficiency, but also the deeper human values that make digital public squares worth preserving.
About the Authors
Ariel Abonizio is an anthropologist, artist, and business strategist working at the intersection of ethnography and emerging technology. He advises global technology companies on product and corporate strategy, with a focus on contextual AI, wearable technologies, and digital ecosystems. His work spans questions of trust in agentic AI, misinformation, transparency, cultural representation, and intimacy.
Beth Goldberg leads an interdisciplinary team of Google researchers and designers at Jigsaw, a Google unit that gives people agency over what comes next. Her team investigates and builds cutting edge technologies for the hardest civic challenges alongside academics, civil society, and technologists. Beth is also a lecturer at Yale Graduate School of Global Affairs on Disinformation & AI.
Emily Saltz is a Senior UX Researcher at Jigsaw, a Google unit that gives people agency over what comes next. Previously, she was a UX Researcher at the NYT R&D Lab, and a Fellow at Partnership on AI. She has a Master’s in HCI from Carnegie Mellon.
Katy Osborn is an Associate Partner at ReD Associates, a global strategy consultancy that helps corporate leaders, think tanks, and foundations drive growth and impact through deep human understanding. She specializes in aligning AI and tech roadmaps with human flourishing, grounding democracy strategies in lived experience, and building market strategies for Fortune 500 consumer tech, luxury, and retail companies through original cultural research.
Maya Potter is a Senior Consultant at ReD Associates whose work spans online safety, AI, and democratic participation and civic life. She has a background in Anthropology and Heritage Studies.
Research Ethics
This research was conducted under the ethical standards of the ICC/ESOMAR International Code on Market, Opinion and Social Research and Data Analytics, and in compliance with the General Data Protection Regulation (GDPR) of the European Union.
Generative AI Disclosure
Generative AI tools were used in two ways in connection with this research. First, closed-model systems (primarily Anthropic’s Claude and OpenAI’s ChatGPT) were employed during the original study to assist with data preparation, clean-up, and clustering in support of qualitative analysis. These tools helped organize and process large volumes of textual material, but all interpretation, coding decisions, and analytic framing were made by the research team, who stand fully behind the validity of the analysis. Second, generative AI also was used in the preparation of this article as a writing support tool to draft and revise sections of text under close author supervision. No generative AI systems were used to collect field data or to substitute for ethnographic interpretation. The authors remain solely responsible for the accuracy, originality, and integrity of the final manuscript.
Acknowledgements
This study was made possible through the contributions of many. Cameron Wu, Stella Dugall, and Millie Arora were integral to the research process—collecting data, conducting analysis, and co-developing several of the frameworks presented here. Lucas Mann and Thea Mann led the co-design workshop, shaping both the process and the materials that enabled moderators and product teams to prototype together. Most importantly, we thank the moderators who participated in this research—for their time, candor, and insight—without whom this work would not exist.
