22nm RISC-V AI Chip Targets Wearables and IoT

Sept. 19, 2025 – 

EMASS, a subsidiary of Nanoveu, has introduced the ECS-DoT, a 22nm microprocessor designed to bring milliwatt-scale intelligence directly to edge and IoT devices. Target applications include wearables, drones, and predictive-maintenance systems, where continuous operation and minimal energy consumption are critical.

ECS-DoT is reported to deliver substantial efficiency gains over benchmark processors, consuming between 1 and 10 µJ of energy per inference. According to the company, this enables always-on operation that current processors cannot sustain, along with larger on-device memory and fusion of data from multiple sensors. These traits position ECS-DoT to help drive next-generation edge AI forward, as Mohamed Sabry, founder and CTO of EMASS, said in an interview with embedded.com.
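To put the reported 1 to 10 µJ per inference in perspective, the rough C sketch below estimates how long a small coin-cell battery could sustain a 1 Hz inference workload. The battery capacity, inference rate, and the omission of all other system power draw are illustrative assumptions for this back-of-the-envelope calculation, not figures from EMASS.

#include <stdio.h>

/* Back-of-the-envelope battery-life estimate for an always-on inference
 * workload. Assumptions (illustrative only, not EMASS data): a CR2032-class
 * coin cell holding roughly 2.4 kJ of usable energy, the article's worst-case
 * 10 uJ per inference, one inference per second, and no other system power
 * draw (sensors, radio, and leakage are ignored). */
int main(void)
{
    const double battery_energy_j   = 2400.0;   /* ~225 mAh at 3 V */
    const double energy_per_inf_j   = 10e-6;    /* upper end of 1-10 uJ */
    const double inferences_per_sec = 1.0;

    double total_inferences = battery_energy_j / energy_per_inf_j;
    double lifetime_days    = total_inferences / inferences_per_sec / 86400.0;

    printf("Inferences per battery: %.0f\n", total_inferences);
    printf("Lifetime at 1 Hz:       %.0f days\n", lifetime_days);
    return 0;
}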

Memory Architecture and Compute Acceleration

A central innovation of ECS-DoT lies in its hybrid memory architecture and dedicated AI acceleration engine. The device is designed to balance low-latency execution with persistent model storage.


“The total available memory is 4 MB, which includes up to 2 MB SRAM and 2 MB MRAM/RRAM. The memory is designed to facilitate extensive memory access with a single access latency. The bandwidth can reach up to 64 bytes per cycle, which at 50 MHz translates to 3.2 GB/s. By combining volatile SRAM with non-volatile MRAM/RRAM, the architecture supports both fast intermediate compute and persistent storage of large model weights,” said Mohamed Sabry.
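As a hypothetical sketch of the usage pattern such a hybrid memory enables, the C fragment below stages a layer's weights from persistent MRAM/RRAM into an SRAM working buffer before compute. The base address, buffer size, and helper function are invented for illustration and do not reflect the actual ECS-DoT memory map or software interface.

#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical hybrid-memory staging pattern: model weights persist in
 * non-volatile MRAM/RRAM across power cycles, while a fast SRAM buffer holds
 * the weights of the layer currently being computed. All addresses and sizes
 * below are placeholders, not the ECS-DoT memory map. */
#define MRAM_WEIGHTS_BASE ((const uint8_t *)0x20000000u)  /* assumed address */
#define SRAM_SCRATCH_SIZE (64u * 1024u)                    /* assumed size   */

static uint8_t sram_scratch[SRAM_SCRATCH_SIZE];

/* Copy one layer's weights from persistent memory into the SRAM scratch
 * buffer; compute kernels then read from SRAM at full on-chip bandwidth. */
static const uint8_t *stage_layer_weights(size_t offset, size_t nbytes)
{
    if (nbytes > SRAM_SCRATCH_SIZE)
        return NULL;                        /* layer too large for scratch */
    memcpy(sram_scratch, MRAM_WEIGHTS_BASE + offset, nbytes);
    return sram_scratch;
}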

On the compute side, ECS-DoT leverages a RISC-V base coupled with a custom accelerator built around a two-dimensional MAC array.

“The ECS-DoT is a RISC-V-based SoC where we integrate our custom AI accelerator, which comprises a 2D array of MAC units. These MAC arrays are tightly coupled with local SRAM banks to minimize memory-fetch latency, while the Multiply-and-Accumulate (MAC) blocks handle convolutional filters and activation functions in hardware. The accelerator includes built-in support for quantized INT8/INT4 inference, with further support to decompress encoded weights, which reduces both computation and memory footprint, allowing sub-10 ms latency even for multi-layer neural networks,” Sabry explained.
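As a general illustration of the INT8 arithmetic that a MAC array accelerates, rather than EMASS's hardware implementation, the C sketch below shows a quantized dot product with a wide accumulator followed by scale-based requantization, a common convention that is assumed here rather than taken from the article.

#include <stdint.h>
#include <stddef.h>

/* Generic INT8 quantized multiply-accumulate: inputs and weights are 8-bit
 * integers, and partial sums are kept in a 32-bit accumulator to avoid
 * overflow. A hardware 2D MAC array performs many such dot products in
 * parallel. */
static int32_t dot_int8(const int8_t *x, const int8_t *w, size_t n)
{
    int32_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += (int32_t)x[i] * (int32_t)w[i];
    return acc;
}

/* Requantize the 32-bit accumulator back to INT8 with a floating-point scale;
 * fixed-point engines typically use an integer multiplier and shift instead. */
static int8_t requantize(int32_t acc, float scale)
{
    float y = (float)acc * scale;
    if (y > 127.0f)  y = 127.0f;
    if (y < -128.0f) y = -128.0f;
    return (int8_t)y;
}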
