GPU HUNTER

Independent benchmarks for local AI inference. Built for engineers who run models on their own metal.

© 2026 GPU HUNTER · Not affiliated with NVIDIA, AMD, or Apple
Some links are affiliate links. We may earn a commission at no extra cost to you.
build a3f4c2 · 2026.04.30

Apple M3 Ultra

M3 Ultra · TSMC 3nm · released 2025-03

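Up to 512 GB unified memory, enough to run Qwen3 235B at Q8 (about 240 GB) on a single machine. None of the single-GPU alternatives listed on this page (96 GB or less) can hold it.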

VRAM (unified memory): 512 GB
Bandwidth: 819 GB/s
TDP: 295 W
Qwen3 32B Q4_K_M: 72 t/s
Score: 86/100
Current price: $9,499
01  //  Inference benchmarks

Single-stream decode · llama.cpp

Model · quant          Decode
Qwen3 32B · Q4_K_M     72 t/s
Qwen3 32B · Q8_0       44 t/s
Qwen3 32B · FP16       22 t/s
# env llama.cpp b4732 · 4096 ctx · batch=1 · prompt=512 · temp=0.0 · median of 5 runs
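
The env line reports each figure as the median of 5 runs. A minimal Python sketch of that aggregation step follows; the helper names and the five sample values are hypothetical and are not the site's actual harness or raw data.

```python
from statistics import median


def decode_tps(tokens_generated: int, elapsed_s: float) -> float:
    """Tokens per second for one single-stream decode run."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    return tokens_generated / elapsed_s


def summarize(runs_tps: list[float]) -> float:
    """Median across runs, which damps one-off slow or fast outliers."""
    if not runs_tps:
        raise ValueError("need at least one run")
    return median(runs_tps)


if __name__ == "__main__":
    # Made-up values for illustration only, not measured results.
    hypothetical_runs = [70.1, 72.0, 73.4, 71.2, 72.9]
    print(f"median decode rate: {summarize(hypothetical_runs):.1f} t/s")
```

The median is a reasonable choice over the mean here because a single run disturbed by background load does not shift the reported figure.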
02  //  Hardware specs
Architecture: M3 Ultra
Process node: TSMC 3nm
Memory: 512 GB
Memory bandwidth: 819 GB/s
FP16 compute: 57 TFLOPS
INT8 compute: 114 TOPS
TDP: 295 W
PCIe: Unified
Form factor: Desktop
Cooling: Active
03  //  Model fit

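Approximate memory required to load the weights plus a 4096-token KV cache. The 128k figure listed for each model appears to be its maximum context window, not the context length used for these estimates.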

Model           Context     Q4              Q8              FP16
Qwen3 32B       128k ctx    19 GB   FITS    36 GB   FITS    64 GB    FITS
Qwen3 72B       128k ctx    42 GB   FITS    78 GB   FITS    144 GB   FITS
Qwen3 235B      128k ctx    132 GB  FITS    240 GB  FITS    470 GB   FITS
Llama 3.3 70B   128k ctx    40 GB   FITS    75 GB   FITS    140 GB   FITS
DeepSeek V3     128k ctx    380 GB  FITS    700 GB  NO      1300 GB  NO
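
The fit verdicts above reduce to comparing a required-memory figure with the 512 GB capacity. The Python sketch below reproduces that check using the numbers from this page. The function names are hypothetical, the weight-size formula (parameters times bits per weight, divided by 8) is a common rule of thumb rather than the site's documented method, and the check ignores whatever memory the operating system keeps for itself.

```python
CAPACITY_GB = 512.0  # unified memory listed on this page


def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Rough weight footprint in GB: billions of params * bits / 8."""
    return params_billion * bits_per_weight / 8


def verdict(required_gb: float, capacity_gb: float = CAPACITY_GB) -> str:
    """Return the page's FITS / NO label for a required-memory figure."""
    return "FITS" if required_gb <= capacity_gb else "NO"


# Required memory in GB, copied from the table on this page.
REQUIRED_GB = {
    "Qwen3 235B": {"Q4": 132, "Q8": 240, "FP16": 470},
    "DeepSeek V3": {"Q4": 380, "Q8": 700, "FP16": 1300},
}

if __name__ == "__main__":
    for model, quants in REQUIRED_GB.items():
        for quant, gb in quants.items():
            print(f"{model:12s} {quant:5s} {gb:5d} GB  {verdict(gb)}")
    # FP16 stores 16 bits per weight, so a 32B model needs about 64 GB.
    print(weights_gb(32, 16))
```

The FP16 figures in the table line up with this rule of thumb (for example 70B parameters at 16 bits gives 140 GB). The quantized figures run higher than a naive 4 or 8 bits per weight would suggest, which is expected once quantization overhead and the KV cache are included.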
+ STRENGTHS
  • ✓ 512 GB of unified memory is enough for 200B+ models at Q4
  • ✓ 819 GB/s memory bandwidth · top tier in its class
  • ✓ Strong tooling: FP16, Q8, Q4, MLX all officially supported
− TRADE-OFFS
  • − Draws 295 W under load; plan power and thermals accordingly
  • − $9,499 puts this firmly in pro tier
  • − Mac-only: CUDA tooling won't run
04  //  You may also be considering
  • RTX PRO 6000 Blackwell · 96GB · $8,499
  • GeForce RTX 5090 · 32GB · $1,999
  • GeForce RTX 4090 · 24GB · $1,799
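
One way to weigh these alternatives against the M3 Ultra is price per GB of memory, together with whether a single unit can hold a given model. The Python sketch below uses only the memory sizes and prices listed on this page. The function names are hypothetical, and the metric says nothing about bandwidth or compute, so treat it as one input among several.

```python
# (memory in GB, price in USD), copied from this page.
CARDS = {
    "Apple M3 Ultra": (512, 9499),
    "RTX PRO 6000 Blackwell": (96, 8499),
    "GeForce RTX 5090": (32, 1999),
    "GeForce RTX 4090": (24, 1799),
}


def usd_per_gb(memory_gb: float, price_usd: float) -> float:
    """Price paid per GB of on-device memory."""
    return price_usd / memory_gb


def can_hold(memory_gb: float, required_gb: float) -> bool:
    """True if a single unit has enough memory for the model."""
    return memory_gb >= required_gb


if __name__ == "__main__":
    required = 132  # Qwen3 235B at Q4, per the model-fit figures on this page
    for name, (mem, price) in CARDS.items():
        label = "fits" if can_hold(mem, required) else "too small"
        print(f"{name:24s} ${usd_per_gb(mem, price):6.2f}/GB  {label}")
```

On these list prices the M3 Ultra works out cheapest per GB and is the only single unit of the four with room for the 132 GB model.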