Meta's Leaked AI Chatbot Rules Allow Harmful Content Amidst 'Anti-Woke' Push

Decoder

Recent disclosures have revealed that Meta’s internal guidelines for its artificial intelligence chatbots permitted content ranging from racist statements to sexually suggestive conversations with minors, even as the company moved to address concerns about perceived “woke AI” by hiring a right-wing activist.

According to a detailed Reuters report, Meta’s internal rules, compiled in a document of more than 200 pages titled “GenAI: Content Risk Standards,” outlined what its AI chatbots, including Meta AI on Facebook, Instagram, and WhatsApp, were allowed to produce. Surprisingly, these standards sanctioned romantic or “sensual” conversations involving minors: examples cited in the guidelines included describing an eight-year-old child as a “work of art” or referring to their body as a “treasure.” The document also permitted certain forms of racist output, allowing chatbots to make statements like “Black people are dumber than white people,” provided the language was not explicitly dehumanizing. Phrases such as “brainless monkeys” were deemed unacceptable, but less explicit racist statements were apparently permissible.

Meta spokesperson Andy Stone acknowledged the troubling nature of these passages, stating they were “inconsistent with our policies” and “never should have been allowed,” and admitting that enforcement had been unreliable. The company confirmed the passages were removed only after Reuters brought them to its attention, and an updated version of the guidelines has yet to be released. Beyond these deeply concerning examples, the standards also allowed chatbots to generate false information, such as an article falsely claiming a British royal had a sexually transmitted disease, provided a disclaimer was attached. Rules for image generation similarly permitted violent scenes, like a man threatening a woman with a chainsaw, though graphic dismemberment was prohibited.

Despite these remarkably permissive internal standards, Meta has at the same time expressed concern that its AI models might be too “woke.” In a move reported by Mashable, the company recently brought on conservative activist Robby Starbuck as a consultant. Starbuck, who is not an AI specialist, is known for his opposition to diversity, equity, and inclusion (DEI) initiatives, has advised the Trump administration, and maintains affiliations with the Heritage Foundation. His hiring reportedly followed an incident in which a Meta chatbot incorrectly linked him to the January 6 Capitol riot, suggesting an effort to address perceived “political bias” in the company’s AI.

This strategic shift aligns with broader political pressures, including a push from the Trump administration for regulations that would compel AI companies holding U.S. government contracts to use politically “neutral” AI models. Critics argue this “neutrality” often serves as a pretext to steer AI systems toward preferred political viewpoints. Meta founder Mark Zuckerberg has a documented history of swiftly adapting to such shifting political demands.

The issue of political bias in AI extends beyond Meta. Studies by researcher David Rozado indicate that most large language models tend to adopt liberal positions on political topics, particularly after fine-tuning. This trend persists even on platforms with right-leaning leadership, such as Elon Musk’s xAI. Worryingly, manual interventions and content moderation efforts have in some cases led these models to spread conspiracy theories, generate antisemitic content, or even praise Hitler, underscoring the complex challenge of managing AI outputs and biases.