
DeepSeek V3

VRAM requirements for running DeepSeek V3 locally at each quantization level, with the cheapest GPU that fits each level listed below.

Q4 VRAM: 380 GB
Q8 VRAM: 700 GB
FP16 VRAM: 1300 GB
Context window: 128k tokens
01  //  GPUs that can run DeepSeek V3

Cheapest compatible hardware by quantization

Sorted cheapest first. All prices are approximate street prices.

Q4: Q4_K_M (4-bit)
needs ≥380 GB VRAM
GPU             VRAM    Price   Tier
Apple M3 Ultra  512 GB  $9,499  Mac pros
Q8: Q8_0 (8-bit)
needs ≥700 GB VRAM
No single GPU tracked here fits DeepSeek V3 at Q8 (700 GB required). Multi-GPU NVLink or cloud inference is needed at this precision level.
FP16: FP16 (full precision)
needs ≥1300 GB VRAM
No single GPU tracked here fits DeepSeek V3 at FP16 (1300 GB required). Multi-GPU NVLink or cloud inference is needed at this precision level.
02  //  Frequently asked

DeepSeek V3 GPU questions

How much VRAM does DeepSeek V3 need?
DeepSeek V3 requires approximately 380 GB VRAM at Q4 quantization, 700 GB at Q8, or 1300 GB at full FP16 precision. Q4 is the only level that fits on a single device tracked here, which makes it the most practical choice for local inference.
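As a rough sanity check on these figures, weight memory can be estimated as parameter count times bits per weight. The sketch below is not how this page derives its numbers. It assumes DeepSeek V3's published total of 671 billion parameters and approximate bits-per-weight values (the 4.5 used for Q4_K_M is an illustrative assumption; real files mix tensor types and often average somewhat higher). It also ignores KV cache and runtime overhead.

```python
# Rough weight-memory estimate: parameters * bits per weight / 8 bytes.
# Assumptions (not from this page): 671e9 total parameters and the
# approximate bits-per-weight values below. KV cache and runtime
# overhead are ignored.
PARAMS = 671e9

BITS_PER_WEIGHT = {
    "Q4_K_M": 4.5,   # assumed average; actual files mix tensor types
    "Q8_0": 8.5,     # 8-bit weights plus a per-block scale
    "FP16": 16.0,
}

def estimate_weight_gb(params: float, bits_per_weight: float) -> float:
    """Return approximate weight size in decimal gigabytes."""
    return params * bits_per_weight / 8 / 1e9

for name, bits in BITS_PER_WEIGHT.items():
    print(f"{name}: ~{estimate_weight_gb(PARAMS, bits):.0f} GB")
```

Under these assumptions the estimates land within a few percent of the rounded figures quoted above. The remaining gap comes from rounding and from the choice of bits-per-weight values.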
What is the cheapest GPU to run DeepSeek V3?
The cheapest single device that fits DeepSeek V3 at Q4 is the Apple M3 Ultra (512 GB of unified memory, about $9,499), which clears the 380 GB that Q4 requires.
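The "cheapest device that fits" lookup amounts to filtering a catalog by memory and taking the lowest price. Here is a minimal sketch of that idea, assuming a hypothetical `Device` record and a `CATALOG` that holds only the one device listed on this page. The thresholds are the page's own figures.

```python
from typing import NamedTuple, Optional

class Device(NamedTuple):
    name: str
    memory_gb: int
    price_usd: int

# Thresholds from this page. CATALOG holds only the one device listed
# here; a real catalog would have many more entries.
REQUIRED_GB = {"Q4": 380, "Q8": 700, "FP16": 1300}
CATALOG = [Device("Apple M3 Ultra", 512, 9499)]

def cheapest_fit(catalog: list[Device], required_gb: int) -> Optional[Device]:
    """Return the lowest-priced device with enough memory, or None."""
    candidates = [d for d in catalog if d.memory_gb >= required_gb]
    return min(candidates, key=lambda d: d.price_usd, default=None)

for level, need in REQUIRED_GB.items():
    match = cheapest_fit(CATALOG, need)
    print(level, "->", match.name if match else "no single device fits")
```

With this one-entry catalog, only Q4 returns a match. That agrees with the per-quantization listings, where Q8 and FP16 have no single-device option.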
Can I run DeepSeek V3 at FP16?
DeepSeek V3 at FP16 requires 1300 GB VRAM, well beyond any single consumer GPU. FP16 is only practical on multi-GPU server configurations. Q4 (380 GB) and Q8 (700 GB) are more realistic, although Q8 also needs a multi-GPU setup or cloud inference.
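To get a feel for what "multi-GPU" means at these sizes, a lower bound on device count is the required memory divided by per-device memory, rounded up. The sketch below assumes a hypothetical 80 GB per device, chosen only for illustration. It ignores KV cache, activations, and any constraints on how the model is split across devices.

```python
import math

def devices_needed(required_gb: float, per_device_gb: float) -> int:
    """Minimum device count whose combined memory covers required_gb."""
    if per_device_gb <= 0:
        raise ValueError("per_device_gb must be positive")
    return math.ceil(required_gb / per_device_gb)

# 80 GB per device is a hypothetical capacity chosen for illustration.
PER_DEVICE_GB = 80
for level, need in {"Q4": 380, "Q8": 700, "FP16": 1300}.items():
    count = devices_needed(need, PER_DEVICE_GB)
    print(f"{level}: at least {count} x {PER_DEVICE_GB} GB")
```

Treat the result as a floor rather than a recommendation. Real deployments need headroom beyond the raw weight size.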
What quantization is best for DeepSeek V3?
Q4_K_M (380 GB) offers the best hardware compatibility and still produces high-quality output. Q8_0 (700 GB) is better for tasks that need higher accuracy, at the cost of nearly twice the VRAM. FP16 (1300 GB) is only practical on multi-GPU server configurations.