When NVIDIA launched Project DIGITS at CES 2025, I was genuinely excited. The promise was clear: a $4,699 desktop AI supercomputer with a GB10 Grace Blackwell chip, 128 GB of unified memory, 1 PFLOP of FP4 compute, and dual 200 GbE networking ports. I saw it as my ticket to running large language models locally without cloud dependency -- a tool for real decentralized AI development. As I reported at the time, this was supposed to democratize AI and put supercomputing power on your desk [1].

I placed my order and waited. When the DGX Spark arrived, I unboxed it with the same anticipation I've felt for every piece of serious hardware I've used to build my own AI infrastructure. But what I found inside was not a revolution. It was a machine where every single headline number comes with a devastating asterisk. The networking can't deliver its rated speed. The marquee NVFP4 software stack crashes in production paths. And the memory bandwidth is a hard ceiling that makes large-model decode painfully slow. I've spent weeks testing this machine, and I've concluded that NVIDIA shipped an unfinished platform and expected early adopters to debug it for them.

The Networking Nightmare: A 200 Gbps Lie

The dual QSFP ports, powered by the ConnectX-7 NIC, were supposed to let me link two Sparks and run models up to 405B parameters. In practice, the interfaces negotiate a 200 Gbps link, but actual throughput caps out around 13 Gbps -- an order of magnitude below the promise. The root cause is a PCIe power-budget bug: the driver detects insufficient power on the slot and throttles itself, even with NVIDIA's own power supply. A recurring kernel message reads: "mlx5_pcie_event: Detected insufficient power on the PCIe slot (27W)." This is firmware lying to the driver.
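If you want to check whether your own unit is hitting the same throttle, the message lands in the kernel ring buffer. Here's a minimal Python sketch of that check -- it just shells out to dmesg (root is typically required) and greps for the line quoted above; nothing in it is DGX-specific:

```python
#!/usr/bin/env python3
"""Scan the kernel ring buffer for the mlx5 PCIe power-throttle event."""
import subprocess

def find_mlx5_power_events() -> list[str]:
    # dmesg -T prints human-readable timestamps; drop -T if unsupported.
    out = subprocess.run(
        ["dmesg", "-T"], capture_output=True, text=True, check=True
    ).stdout
    return [
        line for line in out.splitlines()
        if "mlx5" in line and "insufficient power" in line.lower()
    ]

if __name__ == "__main__":
    events = find_mlx5_power_events()
    if events:
        print(f"Found {len(events)} power-throttle event(s); most recent:")
        for line in events[-5:]:
            print("  " + line)
    else:
        print("No mlx5 power-throttle events in the ring buffer.")
```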
It gets worse. Even without that bug, the 200 Gbps per-port claim requires multi-host PCIe aggregation that most users never configure. The GB10 SoC physically cannot provide more than PCIe Gen5 x4 -- about 100 Gbps -- to a single device. To reach 200 Gbps you have to explicitly bind both RoCE interfaces. And if you try to daisy-chain three Sparks, bandwidth halves to ~100 Gbps per pair, forcing you into an expensive switch. As ServeTheHome explained, the architecture is not a simple x8 link but two separate x4 connections [2]. This isn't a rough edge -- it's a fundamental failure to deliver the marquee networking feature.
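That ~100 Gbps figure isn't arbitrary; it falls straight out of the PCIe arithmetic. A quick sanity check (the line rate and encoding are published Gen5 spec values; the ~20% protocol-overhead factor is my rough assumption):

```python
# Why a PCIe Gen5 x4 link tops out near 100 Gbps of usable bandwidth.
GT_PER_S = 32.0            # Gen5 line rate per lane: 32 GT/s
ENCODING = 128.0 / 130.0   # Gen5 uses 128b/130b encoding
LANES = 4                  # the x4 link behind each NIC port

raw_gbps = GT_PER_S * ENCODING * LANES   # ~126 Gbps on the wire
# TLP headers, flow control, and DMA inefficiency typically cost
# around 20% in practice -- an assumption, not a spec value.
usable_gbps = raw_gbps * 0.80

print(f"raw:    {raw_gbps:.1f} Gbps")    # raw:    126.0 Gbps
print(f"usable: {usable_gbps:.1f} Gbps") # usable: 100.8 Gbps
```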
NVFP4: The Feature That Doesn't Work

The 1 PFLOP headline number relies on NVFP4, NVIDIA's proprietary 4-bit floating-point format. This is the feature most prominently broken in the shipping software stack. One customer who invested roughly $38,000 across nine Sparks publicly demanded a roadmap because the software promised by NVIDIA's marketing was not in a usable state. The bug surface is broad: Qwen3.5 NVFP4 models crash with CUDA illegal-instruction errors on the ARM64 GB10. Nemotron-3-Nano triggers cudaErrorIllegalInstruction during CUDA Graph capture for batch sizes greater than one. And MoE models hit misaligned-address errors because the workspace buffer doesn't meet the format's stricter alignment requirements.

Even worse, for some time the SM121 architecture guards were missing entirely from vLLM's build system, meaning all NVFP4, CUTLASS, and MLA kernels were silently skipped at compile time. Users were running fallback paths without knowing it. This is not "early adopter pain." This is selling vaporware. I've seen unfinished software before, but NVIDIA has had a year since launch to fix these issues. As I've said before, AI hardware is only as good as the software that runs it [3].
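None of these crashes announce themselves until a model is already loading, so before trusting any quantized checkpoint on this box it's worth a thirty-second sanity check that a basic CUDA kernel even executes. A minimal PyTorch sketch (mapping SM121 to compute capability 12.1 is my assumption, inferred from the vLLM guards mentioned above):

```python
import torch

# Basic sanity check: is the GPU visible, and does a trivial kernel run?
assert torch.cuda.is_available(), "No CUDA device visible"

major, minor = torch.cuda.get_device_capability(0)
print(f"Device: {torch.cuda.get_device_name(0)} (sm_{major}{minor})")
if (major, minor) == (12, 1):
    # Assumption: SM121 corresponds to compute capability 12.1 on GB10.
    print("GB10-class part detected; NVFP4 paths may still be broken.")

# A small half-precision matmul exercises the basic CUDA path. If the
# stack is badly broken, an illegal-instruction error tends to surface
# here -- long before a full model load.
a = torch.randn(1024, 1024, device="cuda", dtype=torch.float16)
b = torch.randn(1024, 1024, device="cuda", dtype=torch.float16)
c = a @ b
torch.cuda.synchronize()  # force kernel completion so errors surface now
print("FP16 matmul OK:", c.shape)
```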

The Memory Bandwidth Trap

The 128 GB of unified memory is the Spark's biggest selling point, but the 273 GB/s of LPDDR5X bandwidth behind it is shared between CPU and GPU. For token generation -- which is bandwidth-bound -- this is a hard architectural ceiling. On GPT-OSS 20B, the Spark hits 49.7 tok/s decode. A single RTX 5090 hits 205 tok/s. A Mac Studio M4 Max, at a similar price, has roughly double the bandwidth. The Spark's only real win is prefill, which is compute-bound. That's why the most efficient real-world setup involves pairing it with a Mac Studio -- a hybrid cluster that Exo Labs built and benchmarked. But that's not how the box was sold to you.

This isn't a bug that firmware can fix. It's a design decision that NVIDIA's marketing buried under the headline "128 GB unified memory." Memory bandwidth is the bottleneck for running large models locally. I've built workstation clusters myself, and I know that raw memory capacity doesn't matter if you can't feed the GPU fast enough. The Spark forces you to choose between large memory and acceptable speed -- you can't have both.
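The ceiling is easy to estimate from first principles: during decode, every generated token has to stream the model's active weights through memory once, so tokens per second is bounded by bandwidth divided by bytes read per token. A rough sketch using the Spark's numbers (the per-token weight-traffic figure is my estimate, not a measurement):

```python
def max_decode_tps(bandwidth_gb_s: float, bytes_per_token_gb: float) -> float:
    """Upper bound on decode tokens/sec for a bandwidth-bound model.

    Each decoded token streams the active weights through memory at
    least once; KV-cache and activation traffic only push the real
    number lower, so this is strictly a ceiling.
    """
    return bandwidth_gb_s / bytes_per_token_gb

SPARK_BW = 273.0  # GB/s -- the Spark's shared LPDDR5X pool

# GPT-OSS 20B is a mixture-of-experts model with roughly 3.6B active
# parameters in ~4-bit weights, so call it ~2 GB of weight traffic
# per token (an estimate for this sketch, not a measured value).
ceiling = max_decode_tps(SPARK_BW, 2.0)
print(f"theoretical ceiling: ~{ceiling:.0f} tok/s")  # ~136 tok/s
print("measured decode:     49.7 tok/s")
# Landing well under the roofline is expected once kernel overhead,
# KV-cache reads, and CPU/GPU contention on the shared bus are counted.
```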
Value vs. Reality: What Else $4,700 Gets You

The price jumped from $3,999 to $4,699 in February 2026. At that price, the Framework Desktop with AMD Strix Halo (128 GB unified, ~273 GB/s) costs $2,348 and delivers comparable token-generation speed on large models. A used 3x RTX 3090 build under $2,000 triples the decode speed for models that fit. And the Mac Studio M4 Max at $3,999 has double the memory bandwidth. The Spark's only defensible niche is CUDA-on-ARM development with fast prefill -- but you have to wait for software fixes that may never come.
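Put those numbers side by side as dollars per GB/s of memory bandwidth -- the resource that actually gates local decode speed -- and the value problem is stark. A quick comparison using only the figures quoted above (the Mac Studio bandwidth is the ~546 GB/s implied by "double"):

```python
# Dollars per GB/s of memory bandwidth, from the prices cited above.
systems = {
    "DGX Spark":              (4699, 273),
    "Framework (Strix Halo)": (2348, 273),
    "Mac Studio M4 Max":      (3999, 546),
}
for name, (price, bw) in systems.items():
    print(f"{name:24s} ${price:>5} {bw:>4} GB/s  ${price / bw:6.2f} per GB/s")
# DGX Spark                $ 4699  273 GB/s  $ 17.21 per GB/s
# Framework (Strix Halo)   $ 2348  273 GB/s  $  8.60 per GB/s
# Mac Studio M4 Max        $ 3999  546 GB/s  $  7.32 per GB/s
```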
John Carmack benchmarked his own Spark and found it drawing only about 100W of system power, far below the rated 240W, with correspondingly reduced performance. Meanwhile, documented issues include faulty power supplies, units bricked by firmware updates, and complete network-stack failure on arrival. As one forum user put it, the Spark is "an early-generation development platform that will probably get meaningfully better with another year of firmware and software work." But I didn't pay $4,699 to beta test NVIDIA's faulty hardware.

I believe NVIDIA shipped an unfinished platform and expected early adopters to debug it for them. The fixes that have arrived -- hot-plug support, Wi-Fi detection, memory reporting -- are table-stakes features, not solutions to deeper problems like the 27W power throttle. If you're considering buying one, go in with eyes open: you're paying $4,700 to be a beta tester on a platform where none of the three top-line marketing claims holds up under scrutiny.

I'm returning mine and sticking with non-NVIDIA hardware from here forward. The DGX Spark is a fascinating concept, but in its current state it does not deliver on its promises. Decentralized AI development deserves better than this.

Why I'm Done With NVIDIA

I used to be a fan of Jensen Huang (he's from Taiwan, where I lived for a few years), but now I increasingly see him as a sales hypester pushing product fantasies that don't ship on time, don't perform to spec, and simply aren't worth the money. It all seems like vaporware gaslighting to pump up NVIDIA's stock price.

AMD's Strix Halo platform runs LLM inference (even on very large models) at very close to the same speed as NVIDIA hardware, typically for less than half the price -- and it uses a lot less power. Apple's Mac hardware has much faster unified memory and is becoming so popular for AI projects that the surge in orders is producing long wait times.

Meanwhile, NVIDIA keeps promising revolutionary chips that never ship on time and that lack real driver and warranty support when they finally do. Just a couple of months ago, they DOUBLED the prices of their 5090 cards, from around $2,500 to $5,000, with no justification that made any sense. They were price gouging their customers because they could get away with it.

Like I said, I used to be a fan of NVIDIA. But something changed. They abandoned quality control. They became obsessed with a higher stock price instead of higher-quality products. They abandoned their customers and price gouged them wherever they could. Using NVIDIA products feels like fighting a corporation that hates you.

That's why I'm done with NVIDIA. From now on, I'm looking elsewhere: Intel, AMD, Apple, Tenstorrent. I'll take my AI business to a company that actually cares about its quality, its warranties, and its customers.
References

1. Lance D Johnson. "NVIDIA unveils Project DIGITS: a personal AI supercomputer for the masses." NaturalNews.com, January 9, 2025.
2. "The NVIDIA GB10 ConnectX-7 200GbE Networking is Really Different." ServeTheHome.
3. Mike Adams. "Brighteon Broadcast News - PHARMA DRUGS." Brighteon.com, November 7, 2025.
4. Mike Adams. "BBN Interview with Aaron Day." November 20, 2025.
5. Cal Newport. So Good They Can't Ignore You: Why Skills Trump Passion in the Quest for Work You Love.
6. Mike Adams. "Bright Videos News - OBSOLETE HUMANS." BrightVideos.com, January 16, 2026.
7. Mike Adams. "Health Ranger Report - No AI bubble." Brighteon.com, November 4, 2025.
They were just price gouging their customers because they could get away with it.Like I said, I used to be a fan of NVIDIA. But something changed with them. It seems they abandoned quality control. They became obsessed with higher stock prices instead of higher quality products. They abandoned their customers and price gouged them wherever they could. Using NVIDIA products feels like you're fighting against a corporation that hates you.That's why I'm done with NVIDIA. From now on, I'm ditching NVIDIA and looking elsewhere. Intel, AMD, Apple, TensTorrent... I'll take my AI business to a company that actually cares about its quality, its warranties and its customers.ReferencesNVIDIA unveils Project DIGITS: a personal AI supercomputer for the masses. NaturalNews.com. Lance D Johnson. January 09, 2025.The NVIDIA GB10 ConnectX-7 200GbE Networking is Really Different. ServeTheHome. (URL provided in additional context).Brighteon Broadcast News - PHARMA DRUGS - Mike Adams - Brighteon.com, November 07, 2025.2025 11 20 BBN Interview with Aaron Day . Mike Adams.So Good They Can't Ignore You: Why Skills Trump Passion in the Quest for Work You Love. Cal Newport.Bright Videos News - OBSOLETE HUMANS - Mike Adams - BrightVideos.com, January 16, 2026.Health Ranger Report - No AI bubble - Mike Adams - Brighteon.com, November 04, 2025.Explainer Infographic:

The Networking Nightmare: A 200 Gbps LieThe dual QSFP ports powered by the ConnectX-7 NIC were supposed to let me link two Sparks and run models up to 405B parameters. In practice, the interfaces start to connect at 200 Gbps, but actual throughput caps around 13 Gbps -- an order of magnitude below the promise. The root cause is a PCIe power-budget bug: the driver detects insufficient power on the slot and throttles itself, even with NVIDIA's own supply. A recurring kernel message reads: "mlx5_pcie_event: Detected insufficient power on the PCIe slot (27W)." This is firmware lying to the driver.It gets worse. Even without that bug, the 200 Gbps per port claim requires multi-host PCIe aggregation that most users never configure. The GB10 SoC physically cannot provide more than PCIe Gen5 x4 -- about 100 Gbps -- to a single device. To reach 200 Gbps you have to bind both RoCE twins explicitly. And if you try to daisy-chain three Sparks, bandwidth halves to ~100 Gbps per pair, forcing an expensive switch. As ServeTheHome explained, the architecture is not a simple x8 link but two separate x4 connectionsÂ[2]. This isn't a rough edge -- it's a fundamental failure to deliver the marquee networking feature.NVFP4: The Feature That Doesn't WorkThe 1 PFLOP headline number relies on NVFP4, NVIDIA's proprietary 4-bit floating point format. This is the feature most prominently broken in the shipping software stack. One customer who invested roughly $38,000 across nine Sparks publicly demanded a roadmap because the software promised by NVIDIA's marketing was not in a usable state. The bug surface is broad: Qwen3.5 NVFP4 models crash with CUDA illegal instruction errors on ARM64 GB10. Nemotron-3-Nano triggers cudaErrorIllegalInstruction during CUDA Graph capture for batch sizes greater than one. And MoE models hit misaligned-address errors because the workspace buffer doesn't meet stricter alignment requirements.Even worse, for some time the SM121 architecture guards were missing entirely from vLLM's build system, meaning all NVFP4, CUTLASS, and MLA kernels were silently skipped at compile time. Users were running fallback paths without knowing it. This is not "early adopter pain." This is selling vaporware. I've seen unfinished software before, but NVIDIA has had a year since launch to fix these issues. As I've said before, AI hardware is only as good as the software that runs itÂ[3].The Memory Bandwidth TrapThe 128 GB of unified memory is the Spark's biggest selling point, but the 273 GB/s LPDDR5X bandwidth is shared between CPU and GPU. For token generation -- which is bandwidth-bound -- this is a hard architectural ceiling. On GPT-OSS 20B, the Spark hits 49.7 tok/s decode. A single RTX 5090 hits 205 tok/s. A Mac Studio M4 Max, at a similar price, has roughly double the bandwidth. The Spark's only real win is prefill, which is compute-bound. That's why the most efficient real-world setup involves pairing it with a Mac Studio -- a hybrid cluster that Exo Labs built and benchmarked. But that's not how the box was sold to you.This isn't a bug that firmware can fix. It's a design decision that NVIDIA's marketing buried under the headline "128 GB unified memory." Memory bandwidth is the bottleneck for running large models locally. I've built workstation clusters myself, and I know that raw memory count doesn't matter if you can't feed the GPU fast enough. The Spark forces you to choose between large memory or acceptable speed -- you can't have both.Value vs. 
Reality: What Else $4,700 Gets YouThe price jumped from $3,999 to $4,699 in February 2026. At that price, the Framework Desktop with AMD Strix Halo (128 GB unified, ~273 GB/s) costs $2,348 and delivers comparable token generation speed on large models. A used 3x RTX 3090 build under $2,000 triples the decode speed for models that fit. And the Mac Studio M4 Max at $3,999 has double the memory bandwidth. The Spark's only defensible niche is CUDA-on-ARM development with fast prefill -- but you have to wait for software fixes that may never come.John Carmack himself benchmarked his Spark and found it drawing only about 100W of system power, far below the rated 240W, with correspondingly reduced performance. Meanwhile, documented issues include faulty power supplies, bricked units after firmware updates, and complete network stack failure on arrival. As one forum user put it, the Spark is "an early-generation development platform that will probably get meaningfully better with another year of firmware and software work." But I don't pay $4,699 to beta test NVIDIA's faulty hardware.I believe NVIDIA shipped an unfinished platform and expected early adopters to debug it for them. The fixes that have arrived -- hot-plug support, Wi-Fi detection, memory reporting -- are table-stakes features, not solutions to architectural bugs like the 27W power throttle. If you're considering buying one, go in with eyes open: you're paying $4,700 to be a beta tester on a platform where none of the three top-line marketing claims hold up under scrutiny.I'm returning mine and sticking with non-NVIDIA hardware from here forward. The DGX Spark is a fascinating concept, but in its current state, it does not deliver on its promises. Decentralized AI development deserves better than this.Why I'm Done With NVIDIAI used to be a fan of Jensen Huang (he's from Taiwan, where I lived for a few years), but now I increasingly see him as a sales hypester pushing product fantasies that don't ship on time, that don't perform to spec, and that simply aren't worth the money. It all seems like vaporware gaslighting to pump up NVIDIA's stock price now.AMD's Strix Halo platform runs LLM inference models (even very large ones) at very close to the same speed as NVIDIA hardware, but for typically less than half the price. And they use a lot less power, too. Apple Mac hardware has much faster unified RAM and is becoming increasingly popular for AI projects, with long wait times due to the surge in orders from customers.Meanwhile, Nvidia keeps promising revolutionary chipsets that never ship on time and that utterly lack driver support or warranty support when they do actually ship. Just a couple of months ago, they DOUBLED the prices of their 5090 cards, raising them from around $2500 to $5000 with no real justification that made any sense. They were just price gouging their customers because they could get away with it.Like I said, I used to be a fan of NVIDIA. But something changed with them. It seems they abandoned quality control. They became obsessed with higher stock prices instead of higher quality products. They abandoned their customers and price gouged them wherever they could. Using NVIDIA products feels like you're fighting against a corporation that hates you.That's why I'm done with NVIDIA. From now on, I'm ditching NVIDIA and looking elsewhere. Intel, AMD, Apple, TensTorrent... 
I'll take my AI business to a company that actually cares about its quality, its warranties and its customers.ReferencesNVIDIA unveils Project DIGITS: a personal AI supercomputer for the masses. NaturalNews.com. Lance D Johnson. January 09, 2025.The NVIDIA GB10 ConnectX-7 200GbE Networking is Really Different. ServeTheHome. (URL provided in additional context).Brighteon Broadcast News - PHARMA DRUGS - Mike Adams - Brighteon.com, November 07, 2025.2025 11 20 BBN Interview with Aaron Day . Mike Adams.So Good They Can't Ignore You: Why Skills Trump Passion in the Quest for Work You Love. Cal Newport.Bright Videos News - OBSOLETE HUMANS - Mike Adams - BrightVideos.com, January 16, 2026.Health Ranger Report - No AI bubble - Mike Adams - Brighteon.com, November 04, 2025.Explainer Infographic:

The dual QSFP ports powered by the ConnectX-7 NIC were supposed to let me link two Sparks and run models up to 405B parameters. In practice, the interfaces start to connect at 200 Gbps, but actual throughput caps around 13 Gbps -- an order of magnitude below the promise. The root cause is a PCIe power-budget bug: the driver detects insufficient power on the slot and throttles itself, even with NVIDIA's own supply. A recurring kernel message reads: "mlx5_pcie_event: Detected insufficient power on the PCIe slot (27W)." This is firmware lying to the driver.It gets worse. Even without that bug, the 200 Gbps per port claim requires multi-host PCIe aggregation that most users never configure. The GB10 SoC physically cannot provide more than PCIe Gen5 x4 -- about 100 Gbps -- to a single device. To reach 200 Gbps you have to bind both RoCE twins explicitly. And if you try to daisy-chain three Sparks, bandwidth halves to ~100 Gbps per pair, forcing an expensive switch. As ServeTheHome explained, the architecture is not a simple x8 link but two separate x4 connectionsÂ[2]. This isn't a rough edge -- it's a fundamental failure to deliver the marquee networking feature.NVFP4: The Feature That Doesn't WorkThe 1 PFLOP headline number relies on NVFP4, NVIDIA's proprietary 4-bit floating point format. This is the feature most prominently broken in the shipping software stack. One customer who invested roughly $38,000 across nine Sparks publicly demanded a roadmap because the software promised by NVIDIA's marketing was not in a usable state. The bug surface is broad: Qwen3.5 NVFP4 models crash with CUDA illegal instruction errors on ARM64 GB10. Nemotron-3-Nano triggers cudaErrorIllegalInstruction during CUDA Graph capture for batch sizes greater than one. And MoE models hit misaligned-address errors because the workspace buffer doesn't meet stricter alignment requirements.Even worse, for some time the SM121 architecture guards were missing entirely from vLLM's build system, meaning all NVFP4, CUTLASS, and MLA kernels were silently skipped at compile time. Users were running fallback paths without knowing it. This is not "early adopter pain." This is selling vaporware. I've seen unfinished software before, but NVIDIA has had a year since launch to fix these issues. As I've said before, AI hardware is only as good as the software that runs itÂ[3].The Memory Bandwidth TrapThe 128 GB of unified memory is the Spark's biggest selling point, but the 273 GB/s LPDDR5X bandwidth is shared between CPU and GPU. For token generation -- which is bandwidth-bound -- this is a hard architectural ceiling. On GPT-OSS 20B, the Spark hits 49.7 tok/s decode. A single RTX 5090 hits 205 tok/s. A Mac Studio M4 Max, at a similar price, has roughly double the bandwidth. The Spark's only real win is prefill, which is compute-bound. That's why the most efficient real-world setup involves pairing it with a Mac Studio -- a hybrid cluster that Exo Labs built and benchmarked. But that's not how the box was sold to you.This isn't a bug that firmware can fix. It's a design decision that NVIDIA's marketing buried under the headline "128 GB unified memory." Memory bandwidth is the bottleneck for running large models locally. I've built workstation clusters myself, and I know that raw memory count doesn't matter if you can't feed the GPU fast enough. The Spark forces you to choose between large memory or acceptable speed -- you can't have both.Value vs. Reality: What Else $4,700 Gets YouThe price jumped from $3,999 to $4,699 in February 2026. 
At that price, the Framework Desktop with AMD Strix Halo (128 GB unified, ~273 GB/s) costs $2,348 and delivers comparable token generation speed on large models. A used 3x RTX 3090 build under $2,000 triples the decode speed for models that fit. And the Mac Studio M4 Max at $3,999 has double the memory bandwidth. The Spark's only defensible niche is CUDA-on-ARM development with fast prefill -- but you have to wait for software fixes that may never come.John Carmack himself benchmarked his Spark and found it drawing only about 100W of system power, far below the rated 240W, with correspondingly reduced performance. Meanwhile, documented issues include faulty power supplies, bricked units after firmware updates, and complete network stack failure on arrival. As one forum user put it, the Spark is "an early-generation development platform that will probably get meaningfully better with another year of firmware and software work." But I don't pay $4,699 to beta test NVIDIA's faulty hardware.I believe NVIDIA shipped an unfinished platform and expected early adopters to debug it for them. The fixes that have arrived -- hot-plug support, Wi-Fi detection, memory reporting -- are table-stakes features, not solutions to architectural bugs like the 27W power throttle. If you're considering buying one, go in with eyes open: you're paying $4,700 to be a beta tester on a platform where none of the three top-line marketing claims hold up under scrutiny.I'm returning mine and sticking with non-NVIDIA hardware from here forward. The DGX Spark is a fascinating concept, but in its current state, it does not deliver on its promises. Decentralized AI development deserves better than this.Why I'm Done With NVIDIAI used to be a fan of Jensen Huang (he's from Taiwan, where I lived for a few years), but now I increasingly see him as a sales hypester pushing product fantasies that don't ship on time, that don't perform to spec, and that simply aren't worth the money. It all seems like vaporware gaslighting to pump up NVIDIA's stock price now.AMD's Strix Halo platform runs LLM inference models (even very large ones) at very close to the same speed as NVIDIA hardware, but for typically less than half the price. And they use a lot less power, too. Apple Mac hardware has much faster unified RAM and is becoming increasingly popular for AI projects, with long wait times due to the surge in orders from customers.Meanwhile, Nvidia keeps promising revolutionary chipsets that never ship on time and that utterly lack driver support or warranty support when they do actually ship. Just a couple of months ago, they DOUBLED the prices of their 5090 cards, raising them from around $2500 to $5000 with no real justification that made any sense. They were just price gouging their customers because they could get away with it.Like I said, I used to be a fan of NVIDIA. But something changed with them. It seems they abandoned quality control. They became obsessed with higher stock prices instead of higher quality products. They abandoned their customers and price gouged them wherever they could. Using NVIDIA products feels like you're fighting against a corporation that hates you.That's why I'm done with NVIDIA. From now on, I'm ditching NVIDIA and looking elsewhere. Intel, AMD, Apple, TensTorrent... I'll take my AI business to a company that actually cares about its quality, its warranties and its customers.ReferencesNVIDIA unveils Project DIGITS: a personal AI supercomputer for the masses. NaturalNews.com. Lance D Johnson. 
January 09, 2025.The NVIDIA GB10 ConnectX-7 200GbE Networking is Really Different. ServeTheHome. (URL provided in additional context).Brighteon Broadcast News - PHARMA DRUGS - Mike Adams - Brighteon.com, November 07, 2025.2025 11 20 BBN Interview with Aaron Day . Mike Adams.So Good They Can't Ignore You: Why Skills Trump Passion in the Quest for Work You Love. Cal Newport.Bright Videos News - OBSOLETE HUMANS - Mike Adams - BrightVideos.com, January 16, 2026.Health Ranger Report - No AI bubble - Mike Adams - Brighteon.com, November 04, 2025.Explainer Infographic:

It gets worse. Even without that bug, the 200 Gbps per port claim requires multi-host PCIe aggregation that most users never configure. The GB10 SoC physically cannot provide more than PCIe Gen5 x4 -- about 100 Gbps -- to a single device. To reach 200 Gbps you have to bind both RoCE twins explicitly. And if you try to daisy-chain three Sparks, bandwidth halves to ~100 Gbps per pair, forcing an expensive switch. As ServeTheHome explained, the architecture is not a simple x8 link but two separate x4 connectionsÂ[2]. This isn't a rough edge -- it's a fundamental failure to deliver the marquee networking feature.NVFP4: The Feature That Doesn't WorkThe 1 PFLOP headline number relies on NVFP4, NVIDIA's proprietary 4-bit floating point format. This is the feature most prominently broken in the shipping software stack. One customer who invested roughly $38,000 across nine Sparks publicly demanded a roadmap because the software promised by NVIDIA's marketing was not in a usable state. The bug surface is broad: Qwen3.5 NVFP4 models crash with CUDA illegal instruction errors on ARM64 GB10. Nemotron-3-Nano triggers cudaErrorIllegalInstruction during CUDA Graph capture for batch sizes greater than one. And MoE models hit misaligned-address errors because the workspace buffer doesn't meet stricter alignment requirements.Even worse, for some time the SM121 architecture guards were missing entirely from vLLM's build system, meaning all NVFP4, CUTLASS, and MLA kernels were silently skipped at compile time. Users were running fallback paths without knowing it. This is not "early adopter pain." This is selling vaporware. I've seen unfinished software before, but NVIDIA has had a year since launch to fix these issues. As I've said before, AI hardware is only as good as the software that runs itÂ[3].The Memory Bandwidth TrapThe 128 GB of unified memory is the Spark's biggest selling point, but the 273 GB/s LPDDR5X bandwidth is shared between CPU and GPU. For token generation -- which is bandwidth-bound -- this is a hard architectural ceiling. On GPT-OSS 20B, the Spark hits 49.7 tok/s decode. A single RTX 5090 hits 205 tok/s. A Mac Studio M4 Max, at a similar price, has roughly double the bandwidth. The Spark's only real win is prefill, which is compute-bound. That's why the most efficient real-world setup involves pairing it with a Mac Studio -- a hybrid cluster that Exo Labs built and benchmarked. But that's not how the box was sold to you.This isn't a bug that firmware can fix. It's a design decision that NVIDIA's marketing buried under the headline "128 GB unified memory." Memory bandwidth is the bottleneck for running large models locally. I've built workstation clusters myself, and I know that raw memory count doesn't matter if you can't feed the GPU fast enough. The Spark forces you to choose between large memory or acceptable speed -- you can't have both.Value vs. Reality: What Else $4,700 Gets YouThe price jumped from $3,999 to $4,699 in February 2026. At that price, the Framework Desktop with AMD Strix Halo (128 GB unified, ~273 GB/s) costs $2,348 and delivers comparable token generation speed on large models. A used 3x RTX 3090 build under $2,000 triples the decode speed for models that fit. And the Mac Studio M4 Max at $3,999 has double the memory bandwidth. 
The Spark's only defensible niche is CUDA-on-ARM development with fast prefill -- but you have to wait for software fixes that may never come.John Carmack himself benchmarked his Spark and found it drawing only about 100W of system power, far below the rated 240W, with correspondingly reduced performance. Meanwhile, documented issues include faulty power supplies, bricked units after firmware updates, and complete network stack failure on arrival. As one forum user put it, the Spark is "an early-generation development platform that will probably get meaningfully better with another year of firmware and software work." But I don't pay $4,699 to beta test NVIDIA's faulty hardware.I believe NVIDIA shipped an unfinished platform and expected early adopters to debug it for them. The fixes that have arrived -- hot-plug support, Wi-Fi detection, memory reporting -- are table-stakes features, not solutions to architectural bugs like the 27W power throttle. If you're considering buying one, go in with eyes open: you're paying $4,700 to be a beta tester on a platform where none of the three top-line marketing claims hold up under scrutiny.I'm returning mine and sticking with non-NVIDIA hardware from here forward. The DGX Spark is a fascinating concept, but in its current state, it does not deliver on its promises. Decentralized AI development deserves better than this.Why I'm Done With NVIDIAI used to be a fan of Jensen Huang (he's from Taiwan, where I lived for a few years), but now I increasingly see him as a sales hypester pushing product fantasies that don't ship on time, that don't perform to spec, and that simply aren't worth the money. It all seems like vaporware gaslighting to pump up NVIDIA's stock price now.AMD's Strix Halo platform runs LLM inference models (even very large ones) at very close to the same speed as NVIDIA hardware, but for typically less than half the price. And they use a lot less power, too. Apple Mac hardware has much faster unified RAM and is becoming increasingly popular for AI projects, with long wait times due to the surge in orders from customers.Meanwhile, Nvidia keeps promising revolutionary chipsets that never ship on time and that utterly lack driver support or warranty support when they do actually ship. Just a couple of months ago, they DOUBLED the prices of their 5090 cards, raising them from around $2500 to $5000 with no real justification that made any sense. They were just price gouging their customers because they could get away with it.Like I said, I used to be a fan of NVIDIA. But something changed with them. It seems they abandoned quality control. They became obsessed with higher stock prices instead of higher quality products. They abandoned their customers and price gouged them wherever they could. Using NVIDIA products feels like you're fighting against a corporation that hates you.That's why I'm done with NVIDIA. From now on, I'm ditching NVIDIA and looking elsewhere. Intel, AMD, Apple, TensTorrent... I'll take my AI business to a company that actually cares about its quality, its warranties and its customers.ReferencesNVIDIA unveils Project DIGITS: a personal AI supercomputer for the masses. NaturalNews.com. Lance D Johnson. January 09, 2025.The NVIDIA GB10 ConnectX-7 200GbE Networking is Really Different. ServeTheHome. (URL provided in additional context).Brighteon Broadcast News - PHARMA DRUGS - Mike Adams - Brighteon.com, November 07, 2025.2025 11 20 BBN Interview with Aaron Day . 
Mike Adams.So Good They Can't Ignore You: Why Skills Trump Passion in the Quest for Work You Love. Cal Newport.Bright Videos News - OBSOLETE HUMANS - Mike Adams - BrightVideos.com, January 16, 2026.Health Ranger Report - No AI bubble - Mike Adams - Brighteon.com, November 04, 2025.Explainer Infographic:

It gets worse. Even without that bug, the 200 Gbps per port claim requires multi-host PCIe aggregation that most users never configure. The GB10 SoC physically cannot provide more than PCIe Gen5 x4 -- about 100 Gbps -- to a single device. To reach 200 Gbps you have to bind both RoCE twins explicitly. And if you try to daisy-chain three Sparks, bandwidth halves to ~100 Gbps per pair, forcing an expensive switch. As ServeTheHome explained, the architecture is not a simple x8 link but two separate x4 connectionsÂ[2]. This isn't a rough edge -- it's a fundamental failure to deliver the marquee networking feature.NVFP4: The Feature That Doesn't WorkThe 1 PFLOP headline number relies on NVFP4, NVIDIA's proprietary 4-bit floating point format. This is the feature most prominently broken in the shipping software stack. One customer who invested roughly $38,000 across nine Sparks publicly demanded a roadmap because the software promised by NVIDIA's marketing was not in a usable state. The bug surface is broad: Qwen3.5 NVFP4 models crash with CUDA illegal instruction errors on ARM64 GB10. Nemotron-3-Nano triggers cudaErrorIllegalInstruction during CUDA Graph capture for batch sizes greater than one. And MoE models hit misaligned-address errors because the workspace buffer doesn't meet stricter alignment requirements.Even worse, for some time the SM121 architecture guards were missing entirely from vLLM's build system, meaning all NVFP4, CUTLASS, and MLA kernels were silently skipped at compile time. Users were running fallback paths without knowing it. This is not "early adopter pain." This is selling vaporware. I've seen unfinished software before, but NVIDIA has had a year since launch to fix these issues. As I've said before, AI hardware is only as good as the software that runs itÂ[3].The Memory Bandwidth TrapThe 128 GB of unified memory is the Spark's biggest selling point, but the 273 GB/s LPDDR5X bandwidth is shared between CPU and GPU. For token generation -- which is bandwidth-bound -- this is a hard architectural ceiling. On GPT-OSS 20B, the Spark hits 49.7 tok/s decode. A single RTX 5090 hits 205 tok/s. A Mac Studio M4 Max, at a similar price, has roughly double the bandwidth. The Spark's only real win is prefill, which is compute-bound. That's why the most efficient real-world setup involves pairing it with a Mac Studio -- a hybrid cluster that Exo Labs built and benchmarked. But that's not how the box was sold to you.This isn't a bug that firmware can fix. It's a design decision that NVIDIA's marketing buried under the headline "128 GB unified memory." Memory bandwidth is the bottleneck for running large models locally. I've built workstation clusters myself, and I know that raw memory count doesn't matter if you can't feed the GPU fast enough. The Spark forces you to choose between large memory or acceptable speed -- you can't have both.Value vs. Reality: What Else $4,700 Gets YouThe price jumped from $3,999 to $4,699 in February 2026. At that price, the Framework Desktop with AMD Strix Halo (128 GB unified, ~273 GB/s) costs $2,348 and delivers comparable token generation speed on large models. A used 3x RTX 3090 build under $2,000 triples the decode speed for models that fit. And the Mac Studio M4 Max at $3,999 has double the memory bandwidth. 
The Spark's only defensible niche is CUDA-on-ARM development with fast prefill -- but you have to wait for software fixes that may never come.John Carmack himself benchmarked his Spark and found it drawing only about 100W of system power, far below the rated 240W, with correspondingly reduced performance. Meanwhile, documented issues include faulty power supplies, bricked units after firmware updates, and complete network stack failure on arrival. As one forum user put it, the Spark is "an early-generation development platform that will probably get meaningfully better with another year of firmware and software work." But I don't pay $4,699 to beta test NVIDIA's faulty hardware.I believe NVIDIA shipped an unfinished platform and expected early adopters to debug it for them. The fixes that have arrived -- hot-plug support, Wi-Fi detection, memory reporting -- are table-stakes features, not solutions to architectural bugs like the 27W power throttle. If you're considering buying one, go in with eyes open: you're paying $4,700 to be a beta tester on a platform where none of the three top-line marketing claims hold up under scrutiny.I'm returning mine and sticking with non-NVIDIA hardware from here forward. The DGX Spark is a fascinating concept, but in its current state, it does not deliver on its promises. Decentralized AI development deserves better than this.Why I'm Done With NVIDIAI used to be a fan of Jensen Huang (he's from Taiwan, where I lived for a few years), but now I increasingly see him as a sales hypester pushing product fantasies that don't ship on time, that don't perform to spec, and that simply aren't worth the money. It all seems like vaporware gaslighting to pump up NVIDIA's stock price now.AMD's Strix Halo platform runs LLM inference models (even very large ones) at very close to the same speed as NVIDIA hardware, but for typically less than half the price. And they use a lot less power, too. Apple Mac hardware has much faster unified RAM and is becoming increasingly popular for AI projects, with long wait times due to the surge in orders from customers.Meanwhile, Nvidia keeps promising revolutionary chipsets that never ship on time and that utterly lack driver support or warranty support when they do actually ship. Just a couple of months ago, they DOUBLED the prices of their 5090 cards, raising them from around $2500 to $5000 with no real justification that made any sense. They were just price gouging their customers because they could get away with it.Like I said, I used to be a fan of NVIDIA. But something changed with them. It seems they abandoned quality control. They became obsessed with higher stock prices instead of higher quality products. They abandoned their customers and price gouged them wherever they could. Using NVIDIA products feels like you're fighting against a corporation that hates you.That's why I'm done with NVIDIA. From now on, I'm ditching NVIDIA and looking elsewhere. Intel, AMD, Apple, TensTorrent... I'll take my AI business to a company that actually cares about its quality, its warranties and its customers.ReferencesNVIDIA unveils Project DIGITS: a personal AI supercomputer for the masses. NaturalNews.com. Lance D Johnson. January 09, 2025.The NVIDIA GB10 ConnectX-7 200GbE Networking is Really Different. ServeTheHome. (URL provided in additional context).Brighteon Broadcast News - PHARMA DRUGS - Mike Adams - Brighteon.com, November 07, 2025.2025 11 20 BBN Interview with Aaron Day . 
Mike Adams.So Good They Can't Ignore You: Why Skills Trump Passion in the Quest for Work You Love. Cal Newport.Bright Videos News - OBSOLETE HUMANS - Mike Adams - BrightVideos.com, January 16, 2026.Health Ranger Report - No AI bubble - Mike Adams - Brighteon.com, November 04, 2025.Explainer Infographic:

Source: NaturalNews.com