www.eenewseurope.com, Jun. 29, 2022 –
ARM has launched a new flagship graphics processor core with ray tracing.
The Immortalis GPU is aimed at system-on-chip designs for Android gaming phones, and mirrors the recent launch of a ray-tracing core by Imagination Technologies.
The GPU block is based on a new architecture called Valhall. This uses a new superscalar engine, a simplified scalar instruction set and improved data structure alignment with modern APIs, particularly Vulkan.
The block is configurable from 10 to 16 shader cores, each with a ray tracing unit. Added support for Matrix Multiply instructions delivers machine learning (ML) improvements for applications including computational photography and image enhancement. It supports a configurable L2 cache from 512KB to 2MB and the usual AMBA4 ACE, ACE-LITE, and AXI interfaces.
All Mali GPU architectures are based on the principle of tile-based rendering to minimize the external DDR memory bandwidth needed for framebuffer access. All the primitives are vertex shaded and assigned to screen-space tiles based on pixel coverage, while fragment shading renders into an intermediate tile-buffer stored in a local memory that is tightly coupled to each shader core, and only written back to main memory at the end of rendering for that screen region. All frame buffer accesses during shading, such as depth testing or alpha blending, are energy efficient accesses made to this local memory instead of external DDR.
The use of local memory for intermediate storage enables energy efficient blending, zero bandwidth transient attachments, and almost free 4x multi-sample anti-aliasing.
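The binning step described above, where shaded primitives are assigned to screen-space tiles, can be sketched roughly as follows. This is only an illustration, not ARM's implementation: the 16-pixel tile size, the function name `bin_triangle` and the use of a bounding box (a conservative stand-in for exact pixel coverage) are all assumptions.

```python
import math

def bin_triangle(tri, width, height, tile=16):
    """Return the (tile_x, tile_y) tiles a triangle may touch.

    tri is three (x, y) screen-space vertices. The tile size of 16 is a
    hypothetical value, and the bounding box is a conservative
    approximation of pixel coverage: it can list tiles that the
    triangle does not actually cover.
    """
    xs = [x for x, _ in tri]
    ys = [y for _, y in tri]
    # Clamp the bounding box to the visible screen area.
    min_x, max_x = max(min(xs), 0), min(max(xs), width - 1)
    min_y, max_y = max(min(ys), 0), min(max(ys), height - 1)
    if min_x > max_x or min_y > max_y:
        return []  # entirely off-screen, so no tile needs it
    tx0, tx1 = math.floor(min_x) // tile, math.floor(max_x) // tile
    ty0, ty1 = math.floor(min_y) // tile, math.floor(max_y) // tile
    return [(tx, ty)
            for ty in range(ty0, ty1 + 1)
            for tx in range(tx0, tx1 + 1)]

# A small triangle that straddles the boundary between two tiles.
tiles = bin_triangle([(2, 2), (20, 3), (4, 10)], width=64, height=64)
```

Each tile then only needs to fragment-shade the primitives in its own list, which is what lets the intermediate results stay in local memory until the tile is finished.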
Transaction Elimination (TE) is a key bandwidth-saving feature which allows for significant energy savings on a System on Chip (SoC). When performing TE, the GPU compares the current frame buffer with the previously rendered frame and updates only the parts that have been modified, significantly reducing the amount of data that needs to be transmitted per frame to external memory. The comparison is done on a per-tile basis, using a Cyclic Redundancy Check (CRC) signature to determine whether the tile has been modified. Tiles with the same CRC signature are identical; therefore eliminating them has no impact on the resulting image quality. TE can be used by every application for all frame buffer formats supported by the GPU, irrespective of the frame buffer precision requirements.
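The per-tile comparison can be illustrated in software with a minimal sketch. It assumes a one-byte-per-pixel, row-major frame buffer and a 16-pixel tile, and uses Python's `zlib.crc32` as a stand-in signature; the actual CRC, tile size and the helper names `tile_signatures` and `tiles_to_write` are assumptions, not details from ARM.

```python
import zlib

TILE = 16  # hypothetical tile edge in pixels

def tile_signatures(frame, width, height, tile=TILE):
    """Map each (tile_x, tile_y) to a CRC32 of that tile's pixels.

    frame is a bytes-like, row-major buffer with one byte per pixel
    (an assumption made to keep the sketch short).
    """
    sigs = {}
    for ty in range(0, height, tile):
        for tx in range(0, width, tile):
            crc = 0
            for y in range(ty, min(ty + tile, height)):
                start = y * width + tx
                end = y * width + min(tx + tile, width)
                # Chain the running CRC across the rows of the tile.
                crc = zlib.crc32(frame[start:end], crc)
            sigs[(tx // tile, ty // tile)] = crc
    return sigs

def tiles_to_write(prev_sigs, cur_sigs):
    """Tiles whose signature changed and so must be written back."""
    return sorted(k for k, crc in cur_sigs.items()
                  if prev_sigs.get(k) != crc)

# Two 32x32 frames that differ in a single pixel at x=20, y=5.
previous = bytes(32 * 32)
current = bytearray(previous)
current[5 * 32 + 20] = 255
changed = tiles_to_write(tile_signatures(previous, 32, 32),
                         tile_signatures(current, 32, 32))
```

In this example only one of the four tiles would be written to external memory; the other three are skipped because their signatures match the previous frame.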
The Immortalis G715 core is designed to be combined with the next-generation Mali-G715 core, which includes variable rate shading to help reduce energy use and boost performance.
ARM has also launched new CPU cores for mobile phone SoCs alongside the Immortalis G715 and Mali-G715 in a Total Compute Solution (TCS) with tools and security IP. The Cortex-X3 gives a 25% single-threaded performance improvement over the X2, while the Cortex-A715 focuses on efficient performance, delivering a 20% energy efficiency gain and 5% performance increase over the previous Cortex-A710 to reach the equivalent performance of the X1 core.
ARM has gone back to its 'big.LITTLE' configurations with a mix of low energy and high performance cores, launching an updated version of its Cortex-A510 'LITTLE' core with lower power consumption.
"Smartphones are at the centre of our connected lives. From gaming to productivity, through video calling, social media or virtual environments, it is the device that provides us the connection to everyone and everything, in real time. For developers, making these immersive real-time 3D experiences even more compelling and engaging requires more performance," said Paul Williamson, senior vice president and general manager, Client Line of Business at ARM.
The combination of ARM IP will offer up to 28% more performance and up to 16% power reduction across a range of workloads, such as gaming, where benefits will include longer play time, says Williamson. "We continue to expand the dimensions of performance beyond general-purpose workloads to workloads requiring specialized processing, propelling mobile technology, not just on the GPU but across CPUs and System IP too," he said.
MediaTek, the largest supplier of chips for smartphones, is a key customer.
"Congratulations to Arm on the launch of the new Immortalis GPU, featuring hardware-based ray tracing. Combined with the new powerful Cortex-X3 CPU, we look forward to the next-level of mobile gaming and productivity for our Flagship & Premium mobile SOCs," said Dr. JC Hsu, Corporate VP and GM, Wireless Communications Business Unit at MediaTek.