Update June 5, 2026: The Intel Arc Pro B70 32GB workstation GPU is now the default GPU on every eRacks AI server. Here is why we made that change, and what it means for customers running language-model inference, video analysis, code-completion services, or RAG pipelines on-premise.
For most production AI workloads, the limiting factor is not raw compute throughput. It is whether your model fits in GPU memory.
Once your model fits, token-generation latency is governed mostly by memory bandwidth, not raw teraflops, because each generated token requires reading the model weights out of VRAM. The Arc Pro B70’s 608 GB/s is competitive with cards three times its cost.
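As a back-of-the-envelope illustration of the bandwidth point, the sketch below estimates a ceiling on single-stream decode speed by dividing memory bandwidth by the size of the weights. It assumes every weight is read once per generated token and ignores KV-cache reads, compute time, and batching, so real throughput will be lower. The function name and the example model sizes are hypothetical; only the 608 GB/s figure comes from this post.

```python
def decode_tokens_per_second_ceiling(params_billion: float,
                                     bytes_per_param: float,
                                     bandwidth_gb_s: float = 608.0) -> float:
    """Rough upper bound on single-stream tokens/s for a bandwidth-bound decoder.

    Assumption: each generated token reads all model weights from VRAM once.
    """
    # 1e9 parameters * bytes per parameter / 1e9 bytes per GB = GB of weights
    weights_gb = params_billion * bytes_per_param
    return bandwidth_gb_s / weights_gb


if __name__ == "__main__":
    # Hypothetical example models: (label, billions of parameters, bytes per parameter)
    examples = [
        ("8B at FP16", 8, 2.0),
        ("14B at FP16", 14, 2.0),
        ("32B at 4-bit", 32, 0.5),
    ]
    for label, params_b, bpp in examples:
        ceiling = decode_tokens_per_second_ceiling(params_b, bpp)
        print(f"{label}: at most about {ceiling:.1f} tokens/s per stream")
```

Treat the result as a ceiling for sizing conversations, not a benchmark; measured numbers depend on the inference stack and batch size.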
Single Arc Pro B70 32GB in a 2U rackmount chassis with an AMD EPYC CPU. At FP16 (2 bytes per parameter), 32GB holds the weights of a model up to roughly 14 billion parameters with headroom left for the KV cache; 4-bit quantization brings models of 30 billion parameters and more within reach. Ideal for a single developer or small team running on-premise inference for code completion, code review, document summarization, or chat. Linux, OpenBSD, or FreeBSD pre-installed; you pick the AI stack.
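A rough way to check whether a given model fits on one 32GB card is to multiply the parameter count by the bytes per parameter at the chosen precision. The sketch below does that arithmetic. The 10% headroom reserved for KV cache and runtime buffers is an assumption for illustration, not a measured figure, and the helper names are hypothetical.

```python
# Approximate storage cost per parameter; real quantized formats add
# a small overhead for scales and zero points that is ignored here.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_gb(params_billion: float, precision: str) -> float:
    """Size of the model weights in GB (1 GB = 1e9 bytes)."""
    return params_billion * BYTES_PER_PARAM[precision]


def fits(params_billion: float, precision: str,
         vram_gb: float = 32.0, headroom_fraction: float = 0.10) -> bool:
    """True if the weights fit after reserving headroom (assumed 10%)."""
    usable_gb = vram_gb * (1.0 - headroom_fraction)
    return weight_gb(params_billion, precision) <= usable_gb


if __name__ == "__main__":
    for params_b in (8, 14, 32, 70):
        for precision in ("fp16", "int8", "int4"):
            verdict = "fits" if fits(params_b, precision) else "does not fit"
            size = weight_gb(params_b, precision)
            print(f"{params_b}B at {precision}: {size:.0f} GB of weights, {verdict} in 32 GB")
```

Long context windows or many concurrent requests call for more headroom than the 10% assumed here.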
Four Arc Pro B70 cards for 128GB of aggregate VRAM (32GB per card), in a 4U chassis with an AMD Threadripper PRO 7000-series CPU. Configured for medium-team inference or single-model training of mid-size architectures. Hosts a 70B model at 8-bit quantization (about 70GB of weights, split across the cards) comfortably, with room for KV cache, batching, and parallel requests.
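To put a number on "room for KV cache", the sketch below estimates how many tokens of cache fit in the VRAM left over after loading a 70B model. It assumes 8-bit weights (about 70GB), an FP16 cache, and a Llama-style 70B layout with 80 layers, 8 key/value heads, and a head dimension of 128. Those architecture figures are assumptions for illustration, and the calculation ignores per-card fragmentation and runtime buffers.

```python
def kv_bytes_per_token(layers: int = 80, kv_heads: int = 8,
                       head_dim: int = 128, bytes_per_value: int = 2) -> int:
    """Bytes of KV cache stored per token (assumed Llama-style 70B layout)."""
    # Factor of 2: one key vector and one value vector per layer per KV head.
    return 2 * layers * kv_heads * head_dim * bytes_per_value


def kv_cache_gb(tokens: int) -> float:
    """KV cache size in GB (1 GB = 1e9 bytes) for a given number of cached tokens."""
    return tokens * kv_bytes_per_token() / 1e9


def max_cached_tokens(total_vram_gb: float = 128.0, weights_gb: float = 70.0) -> int:
    """Tokens of KV cache that fit in the VRAM left after the weights are loaded."""
    budget_bytes = int((total_vram_gb - weights_gb) * 1e9)
    return budget_bytes // kv_bytes_per_token()


if __name__ == "__main__":
    total = max_cached_tokens()
    print(f"KV cache per token: {kv_bytes_per_token()} bytes")
    print(f"Cache budget: about {total} tokens across all requests")
    # Hypothetical concurrency target: 16 parallel requests sharing the budget.
    print(f"That is about {total // 16} tokens of context per request at 16 concurrent requests")
```

Dividing the token budget by your concurrency target gives a first estimate of the context length each request can use.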
Four Arc Pro B70 cards by default, with chassis room for up to eight (an upgrade path to 256GB of aggregate VRAM). Built on a Supermicro SYS-421GE-TNRT 4U barebone with dual Intel Xeon SP CPUs, 10 PCIe Gen 5 slots, and quad redundant 2700W Titanium PSUs. This is the “we host our own private model serving stack” configuration, competitive with NVIDIA DGX systems at a fraction of the cost.
The Arc Pro B70 launched in Q1 2026. As of this post, Newegg has the Intel reference card in stock at $1,099. The single-slot Sparkle blower variant is shipping but is currently limited to single-store pickup at Micro Center; we are working with Sparkle’s US distributor to set up reliable multi-card supply. For mid-2026 builds, expect a one- to two-week lead time on multi-GPU configurations while we source through B2B channels. We always quote real lead times before charging.
Browse the new AI configurations at https://eracks.com/products/ai-rackmount-servers/ or email me directly: joe at eracks dot com. Tell me what model you want to run, what your concurrency target is, and what data classification rules you live under – I will spec the right tier and the right OS for it.
– Joe Wolff, founder, eRacks Open Source Systems
joe June 4th, 2026