NeuPro-S™ is a low-power AI processor architecture for on-device deep learning inferencing, imaging and computer vision workloads.
NeuPro-S is a self-contained, specialized AI processor that also supports heterogeneous co-processing with custom AI engines, allowing further customer differentiation and coverage of specific application needs. This lets it fit a broad range of end markets, including IoT, smartphones, surveillance, automotive, robotics, medical and industrial.
NeuPro-S builds on CEVA’s industry-leading position and experience in deep neural networks for computer vision applications. Dozens of customers are already deploying the CEVA-XM4, CEVA-XM6 and NeuPro vision platforms, along with the CDNN Compiler, in consumer, surveillance and ADAS products.
This new AI processor architecture covers a wide range of performance points, from 2 tera operations per second (TOPS) up to 12.5 TOPS per core, and scales beyond 100 TOPS with multi-core instantiations. NeuPro-S was designed to meet the most stringent safety compliance standards and comes with a complete complementary software stack including CDNN, CEVA-CV, the CEVA-SLAM SDK and wide-angle imaging algorithms.
- The NeuPro-S AI processor consists of the NeuPro-S Engine and a CEVA-XM Vision DSP
  - NeuPro-S Engine - Specialized engines for convolution, activation and pooling layers, as well as weight decompression
  - CEVA-XM6 - Fully programmable vector DSP for complementary NN functions and simultaneous processing of computer vision, imaging and customer extension workloads
- Supports mixed 8-bit and 16-bit quantization, enabling a real-time tradeoff between precision and performance
- Supports a multi-level memory hierarchy that enables multi-core scalability
- DDR bandwidth optimized through weight compression and exploitation of network sparsity
- Advanced hardware DMA controllers for parallel processing and minimal system overhead
- The NeuPro-S AI processor architecture includes the following processor options:
  - NPS1000 includes 1024 8x8-bit MAC units
  - NPS2000 includes 2048 8x8-bit MAC units
  - NPS4000 includes 4096 8x8-bit MAC units
- Supports heterogeneous scalability and co-processing with custom AI engines to enable further customer differentiation
- The NeuPro AI processor family was designed to reduce the high barriers to entry into the AI space in terms of both architecture and software, providing an optimized and cost-effective standard AI platform that can be used for a multitude of AI-based workloads and applications
- Self-contained, unified imaging, computer vision and AI processor in a single architecture
- Up to 4096 native 8x8-bit MAC units per core, enabling up to 12.5 TOPS for a single core and 100+ TOPS for multi-core instantiations
- System-aware architecture, optimized for memory bandwidth, power and performance efficiency
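
The relationship between MAC count and the TOPS figures quoted above can be sketched with some back-of-the-envelope arithmetic. The snippet below is illustrative only and rests on assumptions that are not stated in the source: each MAC is counted as two operations (one multiply, one accumulate), every MAC unit is busy on every cycle, and multi-core throughput scales linearly. The function names are hypothetical, and the clock frequency it derives is an implied value under those assumptions, not a published specification.

```python
import math

# Assumption: one MAC counts as two operations (multiply + accumulate).
OPS_PER_MAC = 2


def peak_tops(mac_units: int, clock_hz: float) -> float:
    """Theoretical peak throughput in tera-operations per second."""
    return mac_units * OPS_PER_MAC * clock_hz / 1e12


def implied_clock_hz(mac_units: int, tops: float) -> float:
    """Clock frequency at which `mac_units` MACs would deliver `tops`."""
    return tops * 1e12 / (mac_units * OPS_PER_MAC)


def cores_needed(target_tops: float, tops_per_core: float) -> int:
    """Cores required for a target, assuming ideal linear scaling."""
    return math.ceil(target_tops / tops_per_core)


# 4096 MACs at 12.5 TOPS imply a clock of roughly 1.5 GHz under these assumptions.
clock = implied_clock_hz(4096, 12.5)
print(f"Implied clock for 4096 MACs at 12.5 TOPS: {clock / 1e9:.2f} GHz")

# Number of 12.5 TOPS cores needed to reach 100 TOPS at peak.
print(f"Cores for 100 TOPS: {cores_needed(100, 12.5)}")
```

Sustained throughput on a real network will generally be lower than this peak figure, since it depends on MAC utilization, memory bandwidth and the layer mix.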
Block Diagram of the Edge AI Processor Architecture for Imaging & Computer Vision