The DesignWare ARC EV7x Embedded Vision Processors offer an optional deep neural network (DNN) accelerator for fast and accurate detection of a wide range of objects such as faces, pedestrians, and hand gestures. In addition to supporting convolutional neural networks (CNNs), the DNN supports batched LSTMs (long short-term memories) for applications that require time-based results, such as predicting the location of a pedestrian based on their observed path and speed. The EV7x DNN supports any CNN, including popular networks such as AlexNet, GoogLeNet, Yolo, Faster R-CNN, SqueezeNet, and ResNet.
The optional embedded DNN accelerator adds scalable deep learning and AI capabilities to the EV7x family. The DNN accelerator is optimized for CNNs, batched or convolutional recurrent neural networks (RNNs), and batched LSTMs. It includes advanced hardware features that support the latest pruning, compression, and layer-merging techniques to increase performance and minimize bandwidth. The DNN can be configured with anywhere from 880 to 14,080 multiply-accumulators (MACs). Most of the MACs are used for 2D convolutions, while a portion is dedicated to the 1D convolutions needed for fully connected layers. The DNN datapath supports 8- and 12-bit data precision.
The DNN accelerator supports flexible activation functions, including ReLU, PReLU, ReLU6, tanh, and sigmoid. The EV7x supports all CNNs, including popular networks such as MobileNet, GoogLeNet, ResNet, Yolo, Faster R-CNN, and ICNet. Designers can run CNN graphs originally trained for 32-bit floating-point hardware on the EV7x’s DNN accelerator using 8- or 12-bit resolution, significantly reducing the power and area of their designs while maintaining high levels of detection accuracy.
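To give a feel for what moving from 32-bit floating point to 8- or 12-bit resolution involves, the sketch below applies a generic symmetric linear quantization to a handful of made-up weights and compares the round-trip error at both bit widths. It is only an illustration under stated assumptions: the function names (`quantize`, `dequantize`, `max_error`) and the sample weights are hypothetical, and the scheme shown is a common textbook approach, not necessarily the method used by the EV7x tools or hardware.

```python
def quantize(values, bits):
    """Symmetric linear quantization of floats to signed `bits`-bit integers.

    Generic textbook scheme, assumed here for illustration only.
    """
    qmax = 2 ** (bits - 1) - 1              # 127 for 8-bit, 2047 for 12-bit
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the representable range.
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale


def dequantize(q, scale):
    """Map quantized integers back to approximate float values."""
    return [x * scale for x in q]


def max_error(values, bits):
    """Largest absolute round-trip error after quantizing to `bits` bits."""
    q, scale = quantize(values, bits)
    restored = dequantize(q, scale)
    return max(abs(a - b) for a, b in zip(values, restored))


# Hypothetical float32-style weights, not taken from any real network.
weights = [0.731, -0.258, 0.044, -0.912, 0.377]

for bits in (8, 12):
    print(f"{bits}-bit max round-trip error: {max_error(weights, bits):.6f}")
```

In this toy case the 12-bit representation reproduces the weights more closely than the 8-bit one, at the cost of wider storage and arithmetic. That is the kind of trade-off a designer weighs when choosing between the two precisions.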
The DNN accelerator is supported by a high-performance DMA for transferring image data from external memory into the internal closely coupled memories.