Flex Logix Unveils New Architectural Details on its NMAX Neural Inferencing Engine at the Edge AI Summit
NMAX provides high inferencing throughput at batch =1 at low cost and power
Mountain View, Calif., December 11, 2018 – Flex Logix® Technologies, Inc. today announced new features and specifications for its NMAX™ neural inferencing engine optimized for edge applications. NMAX provides inferencing throughput from 1 to >100 TOPS with high MAC utilization at a batch size of 1, a critical requirement for edge applications. Unlike competitive solutions, NMAX achieves this at much lower cost and with much less power consumption.
The new architectural details announced today provide more insight into how the NMAX architecture works. For example, the NMAX compiler takes a neural model in TensorFlow or Caffe and generates binaries for programming the NMAX array, layer by layer. At the start of each layer, the NMAX array's embedded FPGA (eFPGA) and interconnect are configured to run the matrix multiply needed for that stage. Data streams from SRAM located near the NMAX tile through a variable number of NMAX clusters, where the weights are located, and the results are accumulated. The result is then activated in the eFPGA and stored back in SRAM near the NMAX tile. The NMAX compiler also configures the eFPGA to implement the state machines that address the SRAMs and other functions. At the end of a stage, the NMAX array is reconfigured in <1,000 nanoseconds to process the next layer. In larger arrays, multiple layers can be configured in the array at once, with data flowing directly from one layer to another.
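The layer-by-layer flow described above can be sketched in software terms. This is purely an illustrative model, not Flex Logix code: all function names are hypothetical, and the "reconfigure, stream, accumulate, activate, store" steps are shown as ordinary Python operating on small weight matrices.

```python
# Illustrative sketch (not Flex Logix code) of the layer-by-layer
# dataflow described for the NMAX array. All names are hypothetical.

def relu(x):
    # Activation applied after each accumulation step.
    return [max(0.0, v) for v in x]

def matvec(weights, x):
    # Multiply-accumulate: stream input x through a layer's weights.
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def run_model(layers, x):
    """Process input x through each layer in sequence.

    Each iteration mirrors the described flow: (re)configure the array
    with the layer's weights, stream data through the MACs while
    accumulating, apply the activation, and store the result back to
    local SRAM (represented here by x) for the next layer.
    """
    for weights in layers:          # reconfigure array for this layer
        acc = matvec(weights, x)    # stream + multiply-accumulate
        x = relu(acc)               # activate, store for next layer
    return x

# Two tiny layers: a 2x3 weight matrix followed by a 1x2 matrix.
layers = [
    [[1.0, 0.0, 1.0],
     [0.0, -1.0, 1.0]],
    [[2.0, 1.0]],
]
print(run_model(layers, [1.0, 2.0, 3.0]))  # -> [9.0]
```

In hardware, the per-layer "reconfigure" step corresponds to reprogramming the eFPGA and interconnect in under a microsecond, and keeping `x` in nearby SRAM is what avoids DRAM traffic between layers.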
“The two major challenges in neural inferencing are maximizing MAC utilization to achieve high throughput with the least silicon area, and delivering the required data to the MACs when needed so they remain active while consuming the least power,” said Cheng Wang, co-founder and senior vice president of engineering and software at Flex Logix. “NMAX achieves high MAC utilization, even for batch = 1, by loading the weights very quickly and keeping them very close to the MACs. It also delivers data at low power by keeping most data in SRAM close to the MACs and eliminating unnecessary data movement between layers.”
As a result of these architectural innovations, NMAX achieves data-center-class performance with just 1 or 2 LPDDR4 DRAMs, compared to 8+ for other solutions. Flex Logix's interconnect technologies, developed for eFPGA, are what enable this new architecture.
Availability
NMAX is in development now and will be available in the middle of 2019 for integration in SoCs in TSMC16FFC/12FFC. The NMAX compiler will be available at the same time. For more information, prospective customers can go to www.flex-logix.com to review the slides presented today at the Edge AI Summit and/or contact info@flex-logix.com for further details of NMAX under NDA.