AI safety researchers from OpenAI, Anthropic, and other organizations have publicly criticized the "reckless" and "completely irresponsible" safety culture at xAI, Elon Musk's billion-dollar AI startup. The criticisms come after weeks of scandals at xAI that have overshadowed the company's technological advances.
Last week, xAI's AI chatbot, Grok, spouted antisemitic comments and repeatedly called itself "MechaHitler." Shortly after xAI took the chatbot offline to address the problem, the company launched an even more capable frontier AI model, Grok 4, which TechCrunch and others found consults Elon Musk's personal politics when answering hot-button questions.
In the latest development, xAI launched AI companions that take the form of a hyper-sexualized anime girl and an overly aggressive panda. The researchers are calling for increased attention to xAI's safety practices, which they say are at odds with industry norms.
Boaz Barak, a computer science professor currently on leave from Harvard to work on safety research at OpenAI, said in a post, "I appreciate the scientists and engineers at xAI, but the way safety was handled is completely irresponsible." Barak particularly takes issue with xAI's decision not to publish system cards — industry standard reports that detail training methods and safety evaluations in a good faith effort to share information with the research community.
As a result, Barak says it's unclear what safety training was done on Grok 4. OpenAI and Google have spotty reputations of their own when it comes to promptly sharing system cards when unveiling new AI models. However, these companies historically publish safety reports for all frontier AI models before those models enter full production.
Samuel Marks, an AI safety researcher with Anthropic, also took issue with xAI's decision not to publish a safety report, calling the move "reckless." "Anthropic, OpenAI, and Google's release practices have issues," Marks wrote in a post. "But they at least do something, anything to assess safety pre-deployment and document findings. xAI does not."
"xAI launched Grok 4 without any documentation of their safety testing," Marks continued in his post. "This is reckless and breaks with industry best practices followed by other major AI labs. If xAI is going to be a frontier AI developer, they should act like one."
The reality is that we don't really know what xAI did to test Grok 4. In a widely shared post in the online forum LessWrong, one anonymous researcher claims that Grok 4 has no meaningful safety guardrails based on their testing. Whether that's true or not, the world seems to be finding out about Grok's shortcomings in real time.
Several researchers argue that AI safety and alignment testing not only helps prevent the worst outcomes but also protects against near-term behavioral issues. At the very least, Grok's incidents tend to overshadow xAI's rapid progress in developing frontier AI models that best OpenAI and Google's technology, just a couple of years after the startup was founded.