Inference Appliance

The world’s first enterprise-ready server purpose-built for AI inference: compact, plug-and-play, and deployable in less than an hour

Get cutting-edge, high-performance AI for your on-premises data center or in the cloud. Pre-loaded with generative and agentic AI models, the NR1 Inference Appliance seamlessly integrates into existing AI infrastructure, enabling rapid scale-up or scale-out for data centers and cloud environments. By eliminating legacy CPU/NIC bottlenecks, it ensures near-100% GPU utilization, significantly reducing complexity and cost and enabling high-performance, efficient AI at scale.

Three Treasures in a Box

Reduced Total Cost of Ownership through Higher Scalability, Lower System Cost, and Lower System Energy Consumption, delivering Best-In-Class Inference Requests per Dollar

Vertically Integrated Inference Appliance experience with User-Friendly Software Development Kit and Runtime, with APIs for Inference Development, Deployment, and Serving

Built-in Optimized Open-Source Models; Fine-Tuning and Bring-Your-Own-Model support; OpenAI API Ready; Backend support for all Agentic AI frameworks

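Because the appliance advertises OpenAI API compatibility, standard OpenAI client libraries can target it directly, and any agentic framework that speaks the OpenAI API can use it as a backend. The sketch below uses the official Python client; the appliance address, port, and model name are illustrative assumptions, not published defaults.

    from openai import OpenAI

    # Point the standard OpenAI Python client at the appliance instead of
    # the public OpenAI service. The base URL and model name below are
    # hypothetical placeholders; substitute your NR1 Inference Appliance's
    # address and one of its built-in or user-supplied models.
    client = OpenAI(
        base_url="http://nr1-appliance.local:8000/v1",
        api_key="unused",  # placeholder; an on-prem deployment may not require a key
    )

    response = client.chat.completions.create(
        model="llama-3-8b-instruct",  # assumed built-in open-source model
        messages=[{"role": "user", "content": "Summarize yesterday's support tickets."}],
    )
    print(response.choices[0].message.content)

Because the endpoint is OpenAI-compatible, pointing an existing application or agentic framework at the appliance is typically a one-line base-URL change rather than a framework-specific integration.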

Competitive Advantage

Our first cloud computing and financial services customers are currently running NR1 Inference Appliances on site, demonstrating accelerated system performance, energy efficiency, and real-estate savings versus traditional x86 CPU-reliant inference systems.

  • 2.5X Energy Efficiency
  • 2X Server Density
  • 6X Cost Efficiency

Specifications

Mechanical Form Factor

4U, 19” Rack Mount

PCI Express Capacity

20 dual-slot FHFL x16 PCIe Gen5 card slots, accommodating 4-10 NR1 Inference Modules (AI-CPUs) and 10-16 GPUs in one chassis, with 5 PCIe switches

Performance

  • Neural Network Processing of up to 14 POPs; 2.05 TB DDR at 8.8 TBps; 9.22 GB SRAM
  • General-Purpose Processing: 80 Arm Neoverse N1 cores
  • 160 General-Purpose DSP cores
  • 160 Audio DSP cores
  • 40 Video Engines: 40K FPS @ HD, 20K FPS @ FHD, 5K FPS @ 4K, 1,200 FPS @ 8K

Networking

Up to 1 Tbps + redundancy

Host Memory

Up to 1.6 TB DDR capacity; 2.56 TBps DDR bandwidth

Storage

Up to 10 x 3.84 TB E1.S SSDs

Power

2+2 redundancy mode; typical power: 2.85 kW

Cooling

6 fan modules, each with 2 x 60 x 60 mm dual-rotor fans

Management & Monitoring

BMC with 2 x 1 Gbps RJ-45 management ports on the rear panel

Software

Server configuration, monitoring, and network security

Want to Learn More?