Our first cloud computing and financial services customers are currently running NR1 Inference Appliances on site, demonstrating accelerated system performance, energy efficiency, and real-estate savings versus traditional x86 CPU-reliant inference systems.
Inference Appliance
The world’s first enterprise-ready server purpose-built for AI inference: compact, plug-and-play, and deployable in less than an hour.
Get cutting-edge, high-performance AI for your on-premises data center or in the cloud. Pre-loaded with generative and agentic AI models, the NR1 Inference Appliance seamlessly integrates into existing AI infrastructure, enabling rapid scale-up or scale-out for data centers and cloud environments. By eliminating legacy CPU/NIC bottlenecks, it ensures near-100% GPU utilization, significantly reducing complexity and cost and enabling high-performance, efficient AI at scale.
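As a minimal illustration of that integration model, the sketch below sends an inference request to a model already running on the appliance. The hostname, port, endpoint path, and model name are hypothetical placeholders, and an OpenAI-compatible HTTP API is assumed purely for illustration; consult the appliance documentation for the actual interface.

```python
# Minimal sketch of querying a model served by the appliance over HTTP.
# ASSUMPTIONS: the appliance exposes an OpenAI-compatible chat endpoint;
# the host, port, and model name below are hypothetical placeholders.
import json
import urllib.request

APPLIANCE_URL = "http://nr1-appliance.local:8000/v1/chat/completions"  # hypothetical

payload = {
    "model": "example-llm",  # placeholder; use a model pre-loaded on the appliance
    "messages": [{"role": "user", "content": "Summarize our Q3 risk report."}],
    "max_tokens": 256,
}

req = urllib.request.Request(
    APPLIANCE_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.load(resp)["choices"][0]["message"]["content"])
```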
Competitive Advantage
Energy efficiency
Server density
Cost efficiency
Specifications
Mechanical Form Factor
4U, 19” Rack Mount
PCI Express Capacity
20 dual-slot FHFL PCIe Gen5 x16 slots, accommodating 4-10 NR1 Inference Modules (AI-CPU) and 10-16 GPUs in one chassis, plus 5 PCIe switches
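For context on what that slot capacity implies, the quick calculation below estimates aggregate PCIe bandwidth across all 20 slots, using the standard PCIe Gen5 figures (32 GT/s per lane, 128b/130b encoding); actual throughput is lower after protocol overhead.

```python
# Rough aggregate-bandwidth estimate for the chassis PCIe fabric.
# PCIe Gen5: 32 GT/s per lane; an x16 link carries ~63 GB/s per direction.
GT_PER_LANE = 32          # GT/s, PCIe Gen5 signaling rate
LANES = 16                # x16 link width
SLOTS = 20                # dual-slot FHFL slots in the chassis

# 128b/130b encoding makes raw GT/s nearly equal usable Gb/s per lane.
gbps_per_link = GT_PER_LANE * LANES * (128 / 130)   # ~504 Gb/s per direction
gbs_per_link = gbps_per_link / 8                    # ~63 GB/s per direction

print(f"Per x16 slot: ~{gbs_per_link:.0f} GB/s per direction")
print(f"All {SLOTS} slots: ~{gbs_per_link * SLOTS / 1000:.2f} TB/s aggregate per direction")
```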
Performance
- Neural network processing: up to 14 POPS; DDR: 2.05 TB at 8.8 TB/s; SRAM: 9.22 GB
- General-purpose processing: 80 Arm Neoverse N1 cores
- 160 general-purpose DSP cores
- 160 audio DSP cores
- 40 video engines: 40K FPS @ HD, 20K FPS @ FHD, 5K FPS @ 4K, 1,200 FPS @ 8K
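As a hedged sanity check, these chassis-level figures can be divided by the module count to get per-module numbers. The sketch below assumes the maximum configuration of 10 NR1 Inference Modules and linear scaling; these are illustrative divisions, not published per-module specifications.

```python
# Back-of-envelope per-module breakdown of the chassis performance figures.
# ASSUMPTION: aggregates scale linearly across the maximum of 10 NR1 modules;
# illustrative divisions only, not published per-module specifications.
MODULES = 10

chassis_totals = {
    "NN compute (POPS)": 14,
    "DDR capacity (TB)": 2.05,
    "DDR bandwidth (TB/s)": 8.8,
    "SRAM (GB)": 9.22,
    "Arm Neoverse N1 cores": 80,
    "Video engines": 40,
}

for name, total in chassis_totals.items():
    print(f"{name}: {total} total -> {total / MODULES:g} per module")
```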
Networking
Up to 1 Tb/s, plus redundancy
Host Memory
DDR capacity up to 1.6 TB; DDR bandwidth up to 2.56 TB/s
Storage
Up to 10 x 3.84 TB E1.S SSDs
Power
2+2 redundancy mode; typical draw: 2.85 kW
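To put the typical power draw in operating terms, the short calculation below converts 2.85 kW into annual energy use and a rough electricity cost. The 24/7 duty cycle and the $0.10/kWh rate are illustrative assumptions, not figures from the spec.

```python
# Illustrative annual energy estimate from the typical power figure.
# ASSUMPTIONS: continuous 24/7 operation and a $0.10/kWh electricity rate.
TYPICAL_KW = 2.85
HOURS_PER_YEAR = 24 * 365          # 8,760 hours
RATE_USD_PER_KWH = 0.10            # illustrative utility rate

kwh_per_year = TYPICAL_KW * HOURS_PER_YEAR
print(f"~{kwh_per_year:,.0f} kWh/year (~${kwh_per_year * RATE_USD_PER_KWH:,.0f}/year)")
```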
Cooling
6 fan modules, each with two 60 x 60 mm dual-rotor fans
Management & Monitoring
BMC with 2 x 1 Gb/s RJ-45 management ports on the rear panel (see the monitoring sketch below the table)
Software
Server configuration, monitoring, and network security
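For out-of-band monitoring through the BMC management ports, a sketch follows. It assumes the BMC speaks the industry-standard DMTF Redfish API, which the spec sheet does not state; the hostname, credentials, and chassis ID are hypothetical placeholders.

```python
# Sketch of polling chassis power telemetry over the BMC management port.
# ASSUMPTION: the BMC exposes a DMTF Redfish interface; this is not stated
# in the spec sheet. Hostname and credentials are hypothetical placeholders.
import base64
import json
import ssl
import urllib.request

BMC = "https://nr1-bmc.local"                        # hypothetical BMC hostname
AUTH = base64.b64encode(b"admin:password").decode()  # placeholder credentials

ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE                      # BMCs often use self-signed certs

req = urllib.request.Request(
    f"{BMC}/redfish/v1/Chassis/1/Power",             # standard Redfish power resource;
    headers={"Authorization": f"Basic {AUTH}"},      # 'Chassis/1' ID is a placeholder
)
with urllib.request.urlopen(req, context=ctx) as resp:
    for ctl in json.load(resp).get("PowerControl", []):
        print(f"Power consumed: {ctl.get('PowerConsumedWatts')} W")
```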