Notes from the lab
Long-form benchmarks, buying guides, and write-ups on local AI hardware. Updated weekly.
Running Qwen3 235B on a single Mac Studio
We pushed Apple's M3 Ultra with 512GB unified memory to its limits. Here's what 22 tok/s of dense inference actually feels like.
Best GPUs for Running AI Models Locally in 2026: Ranked by tok/s per Dollar
We benchmarked 7 GPUs from $749 to $9,499 on Qwen3 32B with llama.cpp. The RTX 3090 at $749 used delivers the best value. The RTX 5090 at $1,999 is the best overall. Here is every data point.
RTX PRO 6000 Blackwell vs H100: Which One for Your Home Lab? (2026)
96GB at $8.5k vs 80GB at $30k. We profiled both on Qwen3 72B Q8 with llama.cpp. The RTX PRO 6000 wins on value. The H100 wins on throughput. Here is every benchmark.
The 2026 Used RTX 3090 Buyer's Guide: Mining Cards, OEM Pulls & What to Avoid
The RTX 3090 remains the best $/VRAM GPU for local AI in 2026. 24GB for under $800. Here is exactly what to look for, what to avoid, and where to buy.
DGX Spark, three months in
128GB of unified memory in a 1.2kg desktop. Worth $4k? Depends on what you're optimizing for.
FP8 vs Q4: how much quality are you actually losing?
Perplexity isn't the whole story. We ran human evals across 6 quantization schemes.
Cooling Blackwell: the case for water
600W in a triple-slot air card means 600W in your office. Here's the AIO data.
Get the index, delivered Mondays.
New benchmarks, price drops, and one well-tested buying recommendation. No spam.