| {{{#!wiki style="margin:-0px -10px -5px" {{{#!folding [ Expand · Collapse ] {{{#!wiki style="margin:-5px -1px -11px; word-break:keep-all" | OpenAI | GPT (1/2/3/4/oss/5 · 6 in development) · (o1/o3/o4) |
| Google | Gemini (1/2/3) · Gemma · LaMDA · PaLM 2 |
| Anthropic | Claude (Opus/Sonnet/Haiku) |
| xAI | Grok |
| Meta | LLaMA · Muse Spark |
| Others | }}}}}}}}} |
| Muse Spark | |
| Release date | April 8, 2026 |
| Developer | Meta Superintelligence Labs |
| Function | Language model |
| Links | |
1. Overview
Muse Spark is a language model developed by Meta.
2. Products
2.1. Muse Spark
Released on April 8, 2026.
| <rowcolor=#ffffff> Category | Benchmark | Muse Spark Thinking | Opus 4.6 Max | Gemini 3.1 Pro High | GPT 5.4 Xhigh | Grok 4.2 Reasoning |
| MULTIMODAL | CharXiv Reasoning Figure Understanding | 86.4 | 65.3 (Self-Reported: 61.5) | 80.2 | 82.8 | 60.9 |
| | MMMU Pro Multimodal Understanding | 80.4 | 77.4 | 83.9 | 81.2 | 75.2 |
| | ERQA Embodied Reasoning | 64.7 | 51.6 | 69.4 | 65.4 | 54.1 |
| | SimpleVQA Visual Factuality | 71.3 | 62.2 | 72.4 | 61.1 | 57.4 |
| | ScreenSpot Pro Screenshot Localization (With Python) | 84.1 | 83.1 | 84.4 | 85.4 | — |
| | ZeroBench Multi-Step Visual Reasoning (pass@5, With Python) | 33.0 | — | 29.0 | 41.0 | — |
| TEXT/REASONING | Humanity’s Last Exam Multidisciplinary Reasoning (No Tools) | 42.8 | 40.0 | 45.4 (Self-Reported: 44.4) | 43.9 (Self-Reported: 39.8) | 31.6 |
| | Humanity’s Last Exam Multidisciplinary Reasoning (With Tools) | 50.4 | 53.1 | 51.4 | 52.1 | — |
| | ARC AGI 2 Abstract Reasoning Puzzles (Public) | 42.5 | 63.3 | 76.5 | 76.1 | 53.3 |
| | GPQA Diamond PhD Level Reasoning | 89.5 | 92.7 (Self-Reported: 91.3) | 94.3 | 92.8 | 88.5 |
| | LiveCodeBench Pro Competitive Coding | 80.0 | 70.7 | 82.9 (Self-Reported: 78.2) | 87.5 | 74.2 |
| HEALTH | HealthBench Hard Open-Ended Health Queries | 42.8 | 14.8 | 20.6 | 40.1 | 20.3 |
| | MedXpertQA (Text) Medical Multiple Choice | 52.6 | 52.1 | 71.5 | 59.6 | 50.2 |
| | MedXpertQA (MM) Medical Multiple Choice | 78.4 | 64.8 | 81.3 | 77.1 | 65.8 |
| AGENTIC | DeepSearchQA Agentic Search | 74.8 | 73.7 | 69.7 | 73.6 | 62.8 |
| | SWE-Bench Verified Agentic Coding | 77.4 | 80.8 | 80.6 | — | 76.7* |
| | SWE-Bench Pro Diverse Agentic Coding | 52.4 | 53.4 | 54.2 | 57.7 | 51.8* |
| | Terminal-Bench 2.0 Agentic Terminal Coding | 59.0 | 65.4 | 68.5 | 75.1 | 47.1* |
| | τ²-Bench Telecom Agentic Tool Use (Artificial Analysis) | 91.5 | 92.1 | 95.6 | 91.5 | 96.5 |
| | GDPval-AA Elo Office Tasks (Artificial Analysis) | 1444 | 1606 | 1320 | 1672 | 1055 |