Edge AI is transforming computing by bringing faster, more secure, and energy-efficient intelligence directly to devices, but its full potential depends on new chip architectures and training models adapted to the constraints of the edge.
www.eetimes.eu, Aug. 19, 2025 –
The market for AI-specific chips, including GPUs, accelerators, and high-bandwidth memory (HBM), is expected to soar to $150 billion by 2028. These specialized semiconductors power advanced AI applications, from generative AI models to IoT solutions. The spotlight is on developing silicon designs that offer leading-edge performance while ensuring scalability and efficiency. However, AI chip performance also depends on where data is processed. Edge AI has the potential to be the tide that raises all boats.
The rapid proliferation of edge computing and AI has revolutionized industries—from automotive to manufacturing, healthcare, electronics, retail, and financial services—enabling smarter, faster, more secure solutions for businesses and consumers.
These rapidly growing solutions rely on the cloud to process enormous AI workloads. Despite the high costs associated with cloud generative AI, the cloud's virtually unlimited memory and power capacity means it will continue to drive AI applications for the foreseeable future.
When AI processing takes place primarily in the cloud, challenges arise with security, privacy, response times, and scalability. For example, autonomous vehicles need near-instantaneous response times to operate safely and effectively, so computing resources centralized in the cloud introduce latency and degrade performance.
Edge AI offers the capability to unlock better performance, scalability, security, and innovation. It's transformative because it runs AI directly on a device, computing near the data source rather than in an off-site cloud data center.
Edge AI's distributed nature promotes scalability, making it easier and more cost-effective to deploy edge-optimized AI applications across numerous devices. In addition, processing potentially sensitive data at the edge enhances privacy and security, reducing the risk of data breaches. Edge computing also improves reliability, as devices can function with intermittent or no internet connectivity.
At the heart of this technological transformation lies chip design for edge AI—where the combination of science, engineering, and AI optimization creates chips capable of handling real-time data processing locally.
The desire to perform complex generative AI processing at the edge creates new challenges, including real-time processing needs, restrictive cost requirements, limited memory resources, tight space constraints, and strict power budgets.
Traditional chip architectures face difficulties at the edge due to the high energy cost of constantly moving data between memory and processors. AI workloads, especially large language models (LLMs), require frequent memory access, which can create bottlenecks. To mitigate this, new chip architectures integrate AI accelerators with optimized memory hierarchies, reducing reliance on external memory and enabling faster, more efficient processing. The key principle is to maximize the reuse of data once it has been loaded onto the chip.
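To make the data-reuse principle concrete, here is a minimal, purely illustrative sketch in Python. The tiled loop stands in for the blocking that an accelerator's memory hierarchy performs in hardware: each tile loaded into fast on-chip memory is reused for an entire block of the output before the next tile is fetched. The function name, tile size, and matrix dimensions are arbitrary choices for illustration and are not tied to any particular chip.

```python
import numpy as np

def tiled_matmul(A, B, tile=64):
    """Illustrative tiled matrix multiply.

    Each tile of A and B stands in for a block held in fast on-chip
    memory (e.g., SRAM); a loaded tile is reused across a whole block
    of the output before the next tile is fetched, which is the
    data-reuse idea described above.
    """
    n, k = A.shape
    k2, m = B.shape
    assert k == k2, "inner dimensions must match"
    C = np.zeros((n, m), dtype=A.dtype)
    for i in range(0, n, tile):
        for j in range(0, m, tile):
            for p in range(0, k, tile):
                # These sub-blocks would sit in on-chip memory and be
                # reused for the entire (i, j) output tile.
                C[i:i+tile, j:j+tile] += A[i:i+tile, p:p+tile] @ B[p:p+tile, j:j+tile]
    return C

A = np.random.rand(256, 256)
B = np.random.rand(256, 256)
assert np.allclose(tiled_matmul(A, B), A @ B)
```

The same operations are performed either way; what changes is how often data crosses the chip boundary, which is where most of the energy goes.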
Powerful SoC solutions are necessary so that intelligent devices can see and understand their surroundings. Heterogeneous integration in three dimensions is a logical step so that data processing, storage, and specialized AI accelerators can move even closer together on the chip. Vertical structures are also valuable for edge AI. Instead of making flat structures smaller and smaller, 3D architectures use the third dimension to build upward on the same surface area, similar to high-rise buildings. This technology has been very successful in recent years, especially for flash memory (NAND). These advancements make it possible to further increase chip performance while reducing cost and energy consumption at a given performance level. AI would not be possible without these powerful, specialized chips.
These technological advancements have one thing in common: they require new “intelligent” materials that have not previously been used in chip production. The new 3D structures require a completely different layering of materials, moving from horizontal layers to vertical structures. In addition, the properties of many commonly used materials change dramatically when shrunk further (e.g., copper no longer conducts electricity well at dimensions of just a few nanometers). At the same time, mechanical and thermal properties are becoming increasingly important. Today, a chip produces more heat per unit of surface area than a stovetop, and dissipating that heat becomes ever more challenging in stacked structures. Developing new materials that better fulfill these requirements is increasingly important for the chip industry.
The task of new materials discovery is daunting—with options to combine dozens of potential elements into many different three-dimensional structures, the challenge seems overwhelming. But new tools that run on today’s chips could help revolutionize the chips of tomorrow.
For example, AI aids the development of new, highly specialized materials that make semiconductors faster, more efficient, and more heat-resistant. It can also be leveraged to conduct experiments virtually: it is possible to test how a material behaves at different temperatures, whether it reacts with other substances, or to what degree of purity it can be manufactured, all before it is ever mixed in a laboratory.
The models currently used to train AI must be adapted to edge computing, as traditional models demand computing power that edge devices cannot provide. Because of these limitations, developers are going beyond the usual deep-learning approach. One possible direction is to train AI not on millions of examples in a database but by observing human trainers, so that edge AI devices can improve compute efficiency while still achieving strong reasoning performance.
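One common reading of "learning by observing human trainers" is imitation learning (behavioral cloning), where a compact policy is fit to a modest number of recorded demonstrations rather than a massive dataset. The sketch below uses entirely synthetic data and an arbitrary linear policy purely to illustrate how small such a training step can be on an edge-class device; it is not the specific method the article describes.

```python
import numpy as np

# Minimal behavioral-cloning sketch: a small policy is fit to
# (state, action) pairs recorded from a human trainer, instead of
# training on millions of database examples. All data here is
# synthetic and purely illustrative.
rng = np.random.default_rng(0)

# Pretend a human demonstrated 500 control actions for a 4-feature state.
states = rng.normal(size=(500, 4))             # observed sensor features
true_policy = np.array([0.8, -0.5, 0.3, 0.1])  # hidden behavior being imitated
actions = states @ true_policy + rng.normal(scale=0.05, size=500)

# Fit a tiny linear policy by least squares, cheap enough for an edge device.
learned_policy, *_ = np.linalg.lstsq(states, actions, rcond=None)

# The learned policy now runs locally on new observations.
new_state = rng.normal(size=4)
print("imitated action:", new_state @ learned_policy)
```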
Cost and efficiency must improve before edge AI can go mainstream, and there is evidence this is already underway. A recent comprehensive analysis from Epoch AI determined the rate at which algorithms for pre-training language models have improved since 2012: the compute needed to achieve a given level of language-model performance has halved roughly every eight months. More optimistic projections from SemiAnalysis claim 4× to 10× algorithmic improvements per year for LLMs. Either way, this is far faster than Moore's law. Inference pricing shows the same exponential trend; for example, the cost of GPT-3-quality output has fallen around 1,200× in less than three years.
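A quick back-of-envelope calculation shows why these rates outpace Moore's law. The snippet below simply compounds the figures cited above over a three-year window; the numbers are illustrative arithmetic, not additional data.

```python
# Back-of-envelope arithmetic for the trends cited above (illustrative only).
months = 36  # roughly "less than three years"

# Epoch AI: compute for a given language-model quality halves ~every 8 months.
algorithmic_gain = 2 ** (months / 8)        # ~22.6x less compute needed

# Moore's law, for comparison: a doubling roughly every 24 months.
moore_gain = 2 ** (months / 24)             # ~2.8x

# SemiAnalysis: 4x-10x algorithmic improvement per year, compounded over 3 years.
optimistic_low, optimistic_high = 4 ** 3, 10 ** 3   # 64x to 1000x

print(f"algorithmic gain over {months} months: {algorithmic_gain:.1f}x")
print(f"Moore's-law gain over the same period: {moore_gain:.1f}x")
print(f"optimistic range: {optimistic_low}x to {optimistic_high}x")
```

Against that baseline, a roughly 1,200× drop in inference cost over the same window sits comfortably within the more optimistic compounded range.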
Thanks to advances already underway, edge AI is a reality today, with substantial upside and further innovation in the coming years. As AI applications spread into more and more areas, demand for high-performance, efficient specialized chips embedded in edge devices will continue to grow.