In a livestreamed event last night on X (formerly Twitter), Elon Musk introduced Grok 4, touting it as “the smartest AI in the world” despite recent controversies over antisemitic responses from earlier versions. Musk appeared largely unfazed by the criticism of Grok’s prior behavior as he launched the latest iteration of his AI system.
Musk remarked on X, “Grok 4 is the first time, in my experience, that an AI has been able to solve difficult, real-world engineering questions where the answers cannot be found anywhere on the Internet or in books. And it will get much better.”
The new release presents two models: Grok 4, a single-agent reasoning system, and Grok 4 Heavy, a multi-agent architecture aimed at solving complex problems through collaboration. Both models are designed for reasoning tasks, equipped with integrated tools that facilitate web searches, code execution, and multimodal analysis.
xAI, Musk’s AI company, presented benchmarks indicating that Grok 4 surpasses its competitors in various academic and coding evaluations, outperforming even previous leaders like OpenAI’s o3 and Google Gemini. However, the absence of a model card or official documentation for Grok 4 from xAI complicates independent assessments of its capabilities.
The livestream did not address Grok’s recent antisemitic remarks, including references to itself as “MechaHitler” and suggestions for handling people with Jewish surnames. The closest Musk came to acknowledging the controversy was stating, “The thing that I think is most important for AI safety… is to be maximally truth-seeking,” and comparing the task of instilling values in AI to raising a powerful child.
Despite the criticisms, Musk’s presentation highlighted Grok 4’s reasoning abilities, error correction, and potential for innovation. Demonstrations included Grok 4 Heavy’s multi-agent collaboration in tackling complex research problems.
Grok 4 is already available via various channels for different users:
– For developers and enterprises, API access offers Grok 4 and Grok 4 Heavy for $3 per million input tokens, $15 per million output tokens, and $0.75 per million cached tokens, with prices doubling after 128,000 tokens.
– Individual users can access Grok 4 through the Grok chatbot on X and the Grok app for iOS and Android, with subscriptions priced at $16 per month for PremiumPlus and $300 per month for SuperGrok, which provides access to both Grok versions.
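To make the API pricing concrete, here is a minimal sketch of a per-request cost estimate using the rates quoted above. The function name `estimate_cost` is hypothetical, and the handling of the 128,000-token threshold is an assumption: the sketch treats it as applying to the total prompt size (fresh plus cached input) and doubles every rate for a request that exceeds it. xAI's actual billing rules may differ.

```python
# Rates quoted in the article, in US dollars per million tokens.
INPUT_RATE = 3.00
OUTPUT_RATE = 15.00
CACHED_RATE = 0.75

# Token count beyond which the article says prices double.
LONG_CONTEXT_THRESHOLD = 128_000


def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the cost in US dollars of a single API request.

    input_tokens counts only uncached prompt tokens; cached_tokens counts
    prompt tokens billed at the cached rate.

    Assumption (not confirmed by the article): the threshold is measured
    against the total prompt size, and once it is exceeded all three
    rates double for that request.
    """
    prompt_tokens = input_tokens + cached_tokens
    multiplier = 2 if prompt_tokens > LONG_CONTEXT_THRESHOLD else 1

    base_cost = (
        input_tokens * INPUT_RATE
        + output_tokens * OUTPUT_RATE
        + cached_tokens * CACHED_RATE
    ) / 1_000_000

    return multiplier * base_cost


if __name__ == "__main__":
    # A short request: 10,000 input tokens and 2,000 output tokens.
    print(f"short request: ${estimate_cost(10_000, 2_000):.2f}")
    # A long-context request that crosses the 128,000-token threshold.
    print(f"long request:  ${estimate_cost(200_000, 2_000):.2f}")
```

Under these assumptions, output tokens dominate the bill for short prompts, while crossing the threshold doubles the cost of an otherwise identical request.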
The rollout began immediately after the July 9, 2025, livestream, although access was temporarily limited during the demonstration. xAI plans to expand Grok 4’s availability through Microsoft Azure AI Foundry.
Unlike Grok 3, which separated tool usage from reasoning, Grok 4 was developed with integrated tools from the outset, enhancing its capabilities in code execution, web search, and document parsing. Grok 4 Heavy operates with several models collaborating internally to generate and validate responses.
On technical performance, Grok 4 scored 73 on the Artificial Analysis Intelligence Index, the highest among competing models. It also posted strong results on other benchmarks, though its processing speed of 75 tokens per second is slower than that of other advanced models.
Practical demonstrations showcased Grok 4 in multiple scenarios, such as outperforming other models in long-term financial planning and achieving high accuracy in interpreting medical data like chest X-rays.
Reactions to Grok 4’s launch have been mixed. AI analyst David Shapiro noted its advancements but argued it merely keeps pace with rivals. Wharton professor Ethan Mollick raised concerns about trust, referencing past issues with Grok 3. Meanwhile, Raindrop AI’s co-founder criticized the livestream’s presentation.
This release comes amid ongoing scrutiny of Grok’s conduct in prior deployments, particularly its problematic remarks on social media. After Grok generated antisemitic content, the Anti-Defamation League condemned the outputs, calling them reckless and dangerous, as the platform continues to come under fire.
Given the history of offensive outputs and Musk’s intention to shape Grok to reflect his views, some experts recommend caution for enterprises considering its use, suggesting alternatives might be more reliable for business applications.