The Facebook insider building content moderation for the AI era has a simple way of explaining the problem he is trying to solve: “Content moderation today is basically a lag between policy and implementation.” His answer, now backed by fresh venture capital, is to turn that policy into code.
When Brett Levenson left Apple in 2019 to lead business integrity at Facebook, the company was still reeling from the Cambridge Analytica scandal and a string of crises over harmful content, political manipulation and misinformation. He entered a system under intense public scrutiny, where the stakes were democratic elections and user safety across billions of accounts.
Levenson initially believed that better technology alone could fix Facebook’s moderation failures. What he discovered instead was a deeper structural problem. Human reviewers around the world were expected to memorize a 40‑page policy document that had been machine‑translated into their language, then apply it to a firehose of flagged posts with little context and almost no time. “Then they had about 30 seconds per piece of flagged content to decide not just whether that content violated the rules, but what to do about it: block it, ban the user, limit the spread,” he recalled. According to Levenson, those snap decisions were only “slightly better than 50% accurate.”
That experience crystallized an idea that would eventually pull him out of big tech and into the startup world: the notion of “policy as code.” Instead of long, static rulebooks that moderators have to interpret under pressure, what if the policies themselves could be encoded into logic that software can execute in real time?
That question led to Moonbounce, the company Levenson now runs as co‑founder and CEO. On Friday, the startup announced it has raised $12 million in funding to build what it calls an AI control engine, a system that converts traditional content moderation policies into consistent, predictable AI behavior. The round was co‑led by Amplify Partners and StepStone Group, a signal of growing investor appetite for infrastructure that sits between powerful AI models and the people using them.
“We realized the problem of content moderation was essentially the lag between policy and implementation,” Levenson has said of the company’s origins. The platform works as a control layer that sits between a company’s AI models and its end users. When a business defines what content is acceptable, whether that means filtering financial advice, blocking certain political topics or preventing a chatbot from role‑playing as real people, Moonbounce’s engine takes those written rules and “turns them into technical guardrails,” automatically applying them to every interaction.
To do that, Moonbounce has trained its own large language model to read a customer’s policy documents, interpret them and then evaluate content at runtime. The system has to respond in 300 milliseconds or less and then take an action, from blocking clearly high‑risk content instantly to slowing distribution of borderline material so a human can review it later. For enterprises deploying chatbots, AI companions, social features or image generators, the promise is that they can move fast on AI without having to build “Meta‑grade” trust and safety infrastructure from scratch.
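Moonbounce has not published how its engine works internally, but the pipeline Levenson describes has three recognizable parts: written policy distilled into executable checks, a hard latency budget, and graduated actions rather than a binary allow-or-block switch. The snippet below sketches that shape; the `PolicyRule` schema, the `evaluate` function and the keyword predicates are invented for illustration, and a production system would rely on trained classifiers rather than string matching.

```python
# Hypothetical sketch of a policy-as-code control layer, not Moonbounce's code.
import time
from dataclasses import dataclass
from enum import Enum
from typing import Callable

class Action(Enum):
    ALLOW = 0   # content passes through untouched
    LIMIT = 1   # borderline: slow distribution, queue for human review
    BLOCK = 2   # clearly high-risk: stop it immediately

@dataclass
class PolicyRule:
    name: str
    matches: Callable[[str], bool]  # stand-in for a trained classifier
    action: Action

LATENCY_BUDGET_MS = 300  # the runtime budget described in the article

def evaluate(content: str, rules: list[PolicyRule]) -> Action:
    """Apply every rule within the latency budget; the most severe action wins."""
    deadline = time.monotonic() + LATENCY_BUDGET_MS / 1000
    verdict = Action.ALLOW
    for rule in rules:
        if time.monotonic() > deadline:
            return Action.LIMIT  # out of time: fail safe to human review
        if rule.matches(content) and rule.action.value > verdict.value:
            verdict = rule.action
    return verdict

# Two toy rules distilled from a written policy document.
rules = [
    PolicyRule("no_financial_advice",
               lambda c: "buy this stock" in c.lower(), Action.LIMIT),
    PolicyRule("no_impersonation",
               lambda c: "i am the real" in c.lower(), Action.BLOCK),
]
print(evaluate("You should buy this stock today!", rules))  # Action.LIMIT
```

The fail-safe on a blown deadline mirrors the graduated approach in Levenson’s description: when the system cannot decide in time, content is slowed and routed to a human rather than silently allowed.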
Moonbounce’s pitch lands at a moment when content moderation is shifting from social networks to a much wider terrain. Meta itself has announced plans to lean more heavily on AI and reduce the role of outside vendors in moderating Facebook and Instagram, reflecting a broader industry move toward automation. At the same time, regulators and watchdogs warn that purely automated moderation can be blunt, biased or easily tricked, and that human judgment still has to play a role.
Levenson and his team see the AI boom as compounding the original Facebook‑era problem. It is no longer just posts and comments that need review, but AI‑generated conversations, images, role‑plays and recommendations produced at speeds and volumes that human moderators simply cannot match. “As companies rush to deploy AI assistants and chatbots, the startup is betting that enterprise trust and safety infrastructure will become as critical as the models themselves,” one recent analysis noted. Another described Moonbounce’s system as “real‑time AI content moderation” designed to “transform static policy documents into executable code, creating an immediate safety layer for user‑generated and AI‑created content.”
Today, Moonbounce serves sectors where the risks of harmful or manipulative content are especially acute. Its technology is already in use across social and dating apps with large volumes of user‑generated content, AI companion and character platforms, and AI image generation services. According to the company, it currently processes more than 40 million reviews every day, covering over 100 million daily active users. Those figures illustrate both the size of the problem and the scale at which any meaningful solution now has to operate.
The company’s next frontier goes beyond the simple question of whether to block or allow a piece of content. Levenson and his co‑founder, former Apple engineer Ash Bhardwaj, are focusing on what they call “iterative steering,” a capability born from disturbing real‑world cases where AI systems have made dangerous situations worse.
One example that looms large in Moonbounce’s thinking is the 2024 suicide of a 14‑year‑old Florida boy who became obsessed with a Character.AI chatbot. Rather than offering support, the conversations reportedly deepened his distress, raising tough questions about how AI companions should respond when vulnerable users seek them out. In response, Moonbounce wants its systems to do more than just draw a hard line when harmful topics appear. The goal is to intercept and gently redirect those conversations before they spiral.
“Rather than a blunt refusal when harmful topics arise, the system would intercept the conversation and redirect it, modifying prompts in real time to push the chatbot toward a more actively supportive response,” Levenson explained. “We hope to be able to add to our actions toolkit the ability to steer the chatbot in a better direction to, essentially, take the user’s prompt and modify it to force the chatbot to be not just an empathetic listener, but a helpful listener in those situations,” he said. It is a more interventionist vision of safety, one that treats AI not only as something that must be constrained, but also as a tool that can actively de‑escalate and support users in distress.
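Mechanically, that kind of steering is easy to sketch, though the details below are assumptions: Moonbounce has not published its approach, `detect_distress` stands in for a real classifier, and `generate` stands in for whatever model the customer runs. The idea is simply to rewrite a risky prompt in flight rather than refuse it.

```python
# Hypothetical sketch of prompt steering; all logic here is illustrative.

DISTRESS_MARKERS = ("want to die", "hurt myself", "no reason to live")

def detect_distress(prompt: str) -> bool:
    # A real system would use a trained classifier, not keyword matching.
    text = prompt.lower()
    return any(marker in text for marker in DISTRESS_MARKERS)

def steer(prompt: str) -> str:
    """Rewrite a risky prompt so the downstream chatbot is pushed toward a
    supportive, de-escalating reply instead of a blunt refusal."""
    if not detect_distress(prompt):
        return prompt
    return (
        "The user may be in emotional distress. Respond with empathy, "
        "gently encourage them to reach out to people they trust, and "
        "point them to professional crisis resources.\n"
        f"User message: {prompt}"
    )

def generate(prompt: str) -> str:
    # Placeholder for the customer's own model call.
    return f"[model reply to: {prompt!r}]"

print(generate(steer("Some days I feel like there's no reason to live")))
```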
For Moonbounce’s backers, the bet is that this kind of granular control will become standard infrastructure as AI spreads into banking, healthcare, education and other heavily regulated industries. As lawmakers in the United States, Europe and elsewhere move toward stricter liability for AI‑generated harms, companies face the prospect of being held responsible for what their systems say and show. “Moonbounce’s approach could become essential infrastructure as businesses face liability for AI‑generated content,” one report noted.
That vision lines up with broader trends in the industry. Meta, for example, has argued that automation is necessary just to keep up with the scale of content flowing through its platforms, but independent analyses stress that purely automated tools are still limited and that hybrid systems with humans “in the loop” remain crucial. Moonbounce effectively slots into that hybrid model: it automates policy enforcement at speed and scale, but it can also slow down distribution, flag edge cases and hand them off for human review.
The company itself remains relatively small. Levenson runs the 12‑person outfit with Bhardwaj, who previously built large‑scale cloud and AI infrastructure across Apple’s core offerings. But the capital injection gives them room to expand their engineering and policy teams, deepen the product and bring more clients onto the platform. The fact that the funding round was co‑led by specialist investors in infrastructure and enterprise technology underscores that Moonbounce is being framed less as a consumer‑facing app and more as a foundational layer for others to build on.
For Levenson, the story is laced with irony. The same company whose internal struggles exposed the limits of traditional moderation has, in many ways, shaped the product he is now selling back to the rest of the industry. Asked whether his endgame might be an acquisition by his former employer, he has acknowledged both the strategic fit and the realities of running a venture‑backed startup. He told one interviewer that he recognizes how well Moonbounce would fit into his old employer’s stack, while also noting his fiduciary duty as CEO to consider any outcome that maximizes value for shareholders.
For now, though, the focus is on building out the technology and persuading companies that they cannot afford to treat content moderation as an afterthought in the AI era. If the Facebook years were about belatedly responding to crises after they exploded into public view, Moonbounce is betting that the next phase will be about designing safety into AI systems from the start and making sure that, this time, the policy and the technology move in lockstep.