Meta’s Muse Spark: A Closed-Source AI Model
Meta has released Muse Spark, the inaugural model from Meta Superintelligence Labs, a division established after Meta acquired a stake in Scale AI for $14.3 billion. Built from scratch over nine months, the model is multi-modal, features a "Contemplating" reasoning mode that runs sub-agents in parallel, and now powers Meta AI across various platforms. Unlike Meta's previous Llama models, Muse Spark is closed source.
A Nine-Month Journey
As Alexandr Wang, Meta's first-ever Chief AI Officer, noted on X, "Nine months ago we rebuilt our AI stack from scratch... New infrastructure, new architecture, new data pipelines. Muse Spark is the result of that work, and now it powers Meta AI." The statement underscores how comprehensive the rebuild was: rather than fine-tuning an existing model, Meta replaced its foundational infrastructure.
Overcoming Challenges
Internally codenamed "Avocado," the model was delayed earlier this year after underperforming in internal tests of reasoning, coding, and writing. Wednesday's release suggests those issues have been addressed well enough to position Muse Spark as a competitive offering, despite mixed benchmark results.
Key Features and Innovations
- Multi-Modal Capabilities: Accepts voice, text, and image inputs; output is text-only at launch.
- "Contemplating" Mode: Orchestrates multiple sub-agents for parallel reasoning, competing with Google's Gemini Deep Think and OpenAI's GPT-5.4 Pro.
- Efficiency: Reaches its reasoning performance with about one-tenth the compute of Llama 4 Maverick through "thought compression," a training technique that penalizes excessive thinking time.
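Meta has not published the details of thought compression in the material covered here, so the following is only an illustrative sketch of the general idea of penalizing thinking time during training, not Meta's method. It assumes a reinforcement-style reward in which a correct answer earns a fixed score and every reasoning token beyond a budget costs a small amount. The names `Rollout` and `length_penalized_reward`, the token budget, and the penalty rate are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Rollout:
    """One sampled response from a model (hypothetical structure)."""
    correct: bool          # did the final answer pass the checker?
    thinking_tokens: int   # tokens spent reasoning before the answer


def length_penalized_reward(
    rollout: Rollout,
    token_budget: int = 1000,          # assumed budget, not a published value
    penalty_per_token: float = 0.0005  # assumed rate, not a published value
) -> float:
    """Score a rollout: 1.0 if correct, minus a charge for over-budget thinking."""
    base = 1.0 if rollout.correct else 0.0
    # Only tokens beyond the budget are penalized; staying under it is free.
    overage = max(0, rollout.thinking_tokens - token_budget)
    return base - penalty_per_token * overage


# Two equally correct answers: the shorter chain of thought scores higher.
concise = Rollout(correct=True, thinking_tokens=800)
verbose = Rollout(correct=True, thinking_tokens=3000)

print(length_penalized_reward(concise))
print(length_penalized_reward(verbose))
```

Under a reward shaped like this, a model trained to maximize its score is pushed toward shorter reasoning whenever the extra tokens do not change the answer, which is one plausible way a compute saving of the kind described could arise.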
Benchmark Performance
Meta's published benchmarks place Muse Spark fourth on the Artificial Analysis Intelligence Index v4.0 (score: 52), behind Gemini 3.1 Pro Preview, GPT-5.4, and Claude Opus 4.6. Its rankings vary from benchmark to benchmark, which points to an uneven performance profile rather than a consistent weakness.
On GPQA Diamond
Muse Spark scored 89.5% on the GPQA Diamond benchmark, a graduate-level scientific reasoning test, showing promise in advanced reasoning tasks.