NeuralCAx16 combines 256 multipliers with a carefully designed data path. It supports arbitrary filter sizes and channel configurations. Implemented in the latest process technology, NeuralCAx16 can run at over 1 GHz and outperform many GPU- and CPU-based CNN solutions.
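As a rough illustration of what 256 multipliers at about 1 GHz imply, the sketch below estimates a peak multiply-accumulate (MAC) rate and a lower bound on the cycles needed for one convolution layer. It assumes that every multiplier completes one MAC per clock cycle and that all 256 stay fully utilized; neither assumption is stated in this document. The function names and the example layer dimensions are hypothetical, chosen only for the arithmetic.

```python
NUM_MULTIPLIERS = 256   # from the text
CLOCK_HZ = 1_000_000_000  # "over 1 GHz" in the text; 1 GHz used as a round figure


def peak_macs_per_second(num_multipliers=NUM_MULTIPLIERS, clock_hz=CLOCK_HZ):
    # Assumption: one multiply-accumulate per multiplier per cycle.
    return num_multipliers * clock_hz


def conv_layer_macs(out_h, out_w, in_ch, out_ch, k_h, k_w):
    # Multiply-accumulates for a standard (dense) convolution layer.
    return out_h * out_w * in_ch * out_ch * k_h * k_w


def min_cycles(macs, num_multipliers=NUM_MULTIPLIERS):
    # Lower bound on cycles, assuming 100% multiplier utilization
    # (integer ceiling division).
    return -(-macs // num_multipliers)


if __name__ == "__main__":
    # Hypothetical layer: 56x56 output, 64 -> 64 channels, 3x3 filter.
    macs = conv_layer_macs(56, 56, 64, 64, 3, 3)
    cycles = min_cycles(macs)
    print("peak MAC/s:", peak_macs_per_second())
    print("layer MACs:", macs)
    print("min cycles:", cycles)
    print("min time (ms) at 1 GHz:", cycles / CLOCK_HZ * 1e3)
```

Real throughput depends on memory bandwidth and on how well a given filter and channel configuration maps onto the multiplier array, so these figures are an upper bound on performance, not a measurement.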
Figure 1 shows a typical application of NeuralCAx16. The host processor can be a RISC engine, a microprocessor, or an in-house ASIC core. It handles memory data, performs the other layer operations, and interfaces to the real world.