Bandwidth ain't everything when even the M3 ultra had ~819 GB/s 512 GB
It's good for generation time. Apple M's are weak on compute side.
- Prefill → high compute device.
- Decode → high memory-bandwidth device.
DGX wins in raw compute and FP4 allowing large models. It's prefill performances likely not beaten until M6, if even that.
Good benchmarking and explanation here.
Disaggregating Prefill and Decode: Faster First Tokens, Faster Streams
blog.exolabs.net
So they are apples and oranges.
Head-to-head 2026 comparison of NVIDIA DGX Spark and Mac Studio M5 Max 128GB for local LLM inference, fine-tuning, and on-device RAG: throughput, memory...
presenc.ai
A complex agentic coding tasks where the agent has a big system prompt, reads in tool inputs and files, will perform better on Spark. If you want to actually train and fine-tune models (not just run them), the spark wins on compute alone.
These products are so niche and financially doesn't even make sense when you can probably buy a decade of cloud AI that will continue to advance at a rapid pace, over buying any of these solutions for local LLM, imo. Unless you're generating things that clouds would stop you doing...
those peoples.
The DGX spark actually made sense for network engineers to dev time on it locally and knew it was the same output for the datacenters. Not sure who will spend that kind of cash on that laptop.