LeapMind's Ultra-Low-Power AI Accelerator IP "Efficiera" Achieves Industry-Leading Power Efficiency of 107.8 TOPS/W
August 1, 2023 – Tokyo, Japan – LeapMind Co., Ltd., a leading creator of the standard in edge artificial intelligence (AI), announced today that it has achieved an industry-leading power efficiency of 107.8 TOPS/W*1, 2 with "Efficiera", an ultra-low-power AI accelerator IP built on the company's core technology, "extremely low bit quantization". Compared with similar devices on today's market (roughly 1–5 TOPS/W for GPUs and about 20 TOPS/W for edge AI accelerators), Efficiera delivers approximately 5 to 100 times the performance of these devices at the same power consumption.
Hardware challenges posed by the increasing size of AI tasks
In recent years, as running AI tasks on a wide range of devices has become practical, those tasks have grown larger and more complex year by year. According to an analysis by OpenAI*3, the amount of compute used in AI has increased dramatically since 2012, doubling every 3 to 4 months (versus every 2 years under Moore's Law), and it is worth preparing for systems far beyond current capabilities. The analysis also notes that this growth has been driven by custom hardware such as GPUs and by running more chips in parallel. This points to two technical challenges for the practical application of edge AI. First, as AI tasks become larger and more complex, the computational performance required for AI training and inference is far higher than before. Second, devices need higher-performance AI chips without exceeding their power budgets: the power available for an AI chip to run AI tasks stably is less than 1 W in smartphones and AR glasses, and around 25 W in autonomous driving, where advanced AI processing is required. Edge AI therefore also demands high power efficiency. (Figure 1)
Figure 1: AI tasks and required performance
Reference: Power efficiency comparison (based on in-house research)
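The gap between the two growth rates cited above can be made concrete with a quick calculation. The 3.5-month midpoint of OpenAI's 3–4 month range and the 10-year horizon used here are illustrative assumptions, not figures from this release.

```python
# Sketch: comparing AI compute growth (2x every 3-4 months, per the OpenAI
# analysis cited above) with Moore's Law (2x every 2 years). The 3.5-month
# midpoint and the 10-year horizon are illustrative choices.

def growth_factor(years: float, doubling_months: float) -> float:
    """Total growth over `years` if the quantity doubles every `doubling_months`."""
    return 2 ** (years * 12 / doubling_months)

ai_growth = growth_factor(10, 3.5)      # AI compute demand
moore_growth = growth_factor(10, 24.0)  # transistor density under Moore's Law
print(f"AI compute over 10 years:  x{ai_growth:.3g}")
print(f"Moore's Law over 10 years: x{moore_growth:.3g}")
print(f"Gap: x{ai_growth / moore_growth:.3g}")
```

Even over a single decade, demand outpaces Moore's Law by many orders of magnitude, which is why the analysis points to custom hardware and parallelism rather than process scaling alone.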
Low-bit quantization technology: a solution for enabling edge AI with "Efficiera"
Deep learning computations are usually performed in 32-bit floating point (FP32) to preserve accuracy. However, this approach involves repeated matrix operations (sum-of-products operations) and demands substantial computational resources, so several requirements must be met in both software and hardware to run AI tasks on edge devices, where memory and power are limited and real-time execution is required. Key requirements include the following. On the software side: lightweight models, model accuracy comparable to FP32, and model conversion that maximizes the execution efficiency of the AI accelerator. On the hardware side: low power consumption, high performance across a variety of AI tasks, lower memory bandwidth, and lower memory usage.
To enable edge AI, all of these requirements must be addressed simultaneously. A common first step is quantization, a technique that reduces model size. 8-bit quantization is widely used for AI models running on edge devices such as smartphones, but even so, the AI tasks that can practically be deployed remain limited to tasks such as object detection and image processing of static images, owing to power consumption and heat dissipation constraints. The challenge in deploying a wider variety of AI tasks is that current power efficiency and hardware performance are not sufficient for the scale of those tasks.
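To illustrate the 8-bit quantization mentioned above, here is a minimal sketch mapping FP32 weights to int8 with a single per-tensor scale. The weight values are invented for the example; real toolchains also handle zero points, per-channel scales, and calibration.

```python
import numpy as np

# Minimal sketch of symmetric per-tensor 8-bit quantization: FP32 weights
# are mapped to int8 codes plus one FP32 scale. Values are illustrative.

def quantize_int8(w: np.ndarray):
    scale = np.abs(w).max() / 127.0                        # largest magnitude -> 127
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.array([0.31, -1.20, 0.05, 0.88], dtype=np.float32)
q, s = quantize_int8(w)
print(q)                      # int8 codes, 4x smaller than FP32
print(dequantize(q, s) - w)   # per-element rounding error, at most scale/2
```

Storage drops 4x versus FP32, at the cost of a rounding error bounded by half the scale, which is the trade-off the paragraph above describes.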
In response, LeapMind aims to enable practical AI on a variety of devices by addressing both the software and hardware requirements with its core strength, "extremely low bit quantization" technology. On the software side, this technology shrinks models down to 1-bit weights and 2-bit activations, and the company's patented, proprietary quantization techniques achieve this low-bit quantization with minimal degradation in accuracy. In addition, a model converter dedicated to "Efficiera" optimizes compute efficiency and memory transfers to maximize hardware execution efficiency. On the hardware side, LeapMind has been developing the "Efficiera" AI accelerator to execute extremely low bit quantized models efficiently. Extremely low bit quantization allows matrix operations to be performed with simple logic operations, providing high arithmetic efficiency in a small circuit area. Because the data transferred to and from DRAM during inference is only 1 or 2 bits wide, the effect is equivalent to data compression, reducing both memory usage and bandwidth. With extremely low bit quantization, it is now possible to improve power efficiency, arithmetic efficiency, and area efficiency while minimizing the accuracy degradation caused by quantization.
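The claim that matrix operations reduce to simple logic operations can be sketched in software. The bit-plane/popcount decomposition below is the standard trick used in binarized neural networks for the 1-bit-weight x 2-bit-activation setting the release describes; it is not LeapMind's actual circuit design, and all data values are invented for the example.

```python
# Hedged sketch: a dot product with 1-bit weights (+1/-1) and 2-bit
# activations computed using only AND, NOT, and popcount, i.e. the kind of
# logic operations that replace multiply-accumulates under extremely low
# bit quantization. This is the generic binarized-network decomposition,
# not LeapMind's hardware.

def dot_lowbit(w_bits, act_planes, n):
    """Dot product of n weights in {+1, -1} (bit set => +1) with 2-bit
    activations supplied as bit-planes (LSB first)."""
    mask = (1 << n) - 1
    total = 0
    for shift, plane in enumerate(act_planes):
        pos = bin(w_bits & plane).count("1")          # +1 weights hitting set bits
        neg = bin(~w_bits & plane & mask).count("1")  # -1 weights hitting set bits
        total += (pos - neg) << shift                 # weight this bit-plane by 2^shift
    return total

weights = [+1, -1, +1, +1]   # illustrative 1-bit weights
acts    = [3, 1, 0, 2]       # illustrative 2-bit activations (0..3)

w_bits = sum(1 << i for i, w in enumerate(weights) if w > 0)
planes = tuple(sum(((a >> b) & 1) << i for i, a in enumerate(acts))
               for b in range(2))
result = dot_lowbit(w_bits, planes, len(weights))
reference = sum(w * a for w, a in zip(weights, acts))
print(result, reference)   # the logic-op version matches the arithmetic one
```

Because each multiply-accumulate collapses into AND-plus-popcount on packed bits, one word-wide logic unit processes many weights per cycle, which is where the area and power efficiency comes from.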
Soichi Matsuda, Chief Executive Officer of LeapMind, says: "As one of the few companies in the world specializing in the research and development of quantization technology for AI and in the design of semiconductors for AI accelerators, we have pursued the practical application of AI for many years through joint development with major companies. Balancing high power efficiency with AI task processing performance in AI accelerators is essential for the implementation of AI in society. Today, we have achieved a power efficiency of 107.8 TOPS/W, far exceeding that of conventional GPUs and NPUs. I believe this is an important milestone in our efforts to bring AI to the public more broadly. We will continue to strive to solve our customers' problems and pursue research and development that contributes to the evolution of AI technology."
Please refer to the LeapMind website for details of Efficiera: https://leapmind.io/en/products/efficiera-ip/
Appendix
*1 Estimated values are based on logic synthesis results using Cadence Genus Synthesis Solution, under the following conditions:
・Calculated for the IP standalone (not including the power of other subsystems such as DRAMs, main bus processors, etc.)
・Process: 7nm
・Clock frequency: 533 MHz
・Power supply voltage: Center
・Temperature: 25℃
・MAC utilization: 75.8%
*2 TOPS/W: Processing performance per watt (TOPS: tera operations per second). This indicates power efficiency; the higher the number, the more computation can be processed at low power consumption.
*3 https://openai.com/research/ai-and-compute
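The TOPS/W metric defined in *2 can be sketched as a back-of-the-envelope calculation. The clock frequency (533 MHz) and MAC utilization (75.8%) come from the appendix above; the MAC count and power figure below are purely hypothetical placeholders, since the release does not disclose them.

```python
# Back-of-the-envelope sketch of the TOPS/W metric from footnote *2.
# Convention: one MAC counts as 2 operations (multiply + add).
# num_macs and the wattage below are HYPOTHETICAL example values.

def tops(num_macs: int, freq_hz: float, utilization: float) -> float:
    """Sustained tera-operations per second for a MAC array."""
    return 2 * num_macs * freq_hz * utilization / 1e12

def tops_per_watt(t: float, watts: float) -> float:
    """Power efficiency: processing performance per watt."""
    return t / watts

t = tops(num_macs=16384, freq_hz=533e6, utilization=0.758)  # hypothetical MAC count
print(f"{t:.2f} TOPS at {tops_per_watt(t, 0.125):.1f} TOPS/W")  # hypothetical 125 mW
```

The same throughput at lower power, or higher throughput at the same power, both raise the TOPS/W figure, which is why the metric is used to compare accelerators across power classes.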
About LeapMind Inc.
Founded in 2012, LeapMind provides solutions mainly to the consumer electronics, automotive, and manufacturing industries, as one of the few companies in the world offering both "extremely low bit quantization" technology to downsize AI models and proprietary AI accelerator semiconductor design, under the corporate mission of "To create innovative devices with machine learning and make them available everywhere." To provide foundational technologies for the next generation of information devices, the company is committed to developing both the software and hardware necessary to create them. LeapMind's business is highly acclaimed and has raised a cumulative total of 4.99 billion yen in funding (as of June 2023). For more information, please visit the corporate website (https://leapmind.io/en/).