Design Reuse


Seminar: Tera-Scale Architectures
Thursday, December 4, 2008, 11:00 AM - 12:00 PM | Room: 2

Serial computing is running into the memory, ILP, and power-thermal walls, which marks the near end of uni-processors with ever-faster clock rates. The next evolution in computing is already under way: the move from few-core to many-core processors to handle the exponential growth of digital data. This technical seminar will focus on the design and application of multi-core processor architectures targeting teraflops performance. Topics addressed will include in-network cache coherence protocols, NoC-based multi-core architectures, and software optimization and compilation.


Huy Nam Nguyen
Bull S.A.S. / METASymbiose S.A.S.


As future processors integrate an increasing number of cores, efficient directory-based cache coherence protocols will become necessary. Most current protocols targeting this problem face well-known latency and scalability issues. A new approach that embeds such a protocol within a NoC, thus enabling in-transit optimization of memory accesses, is under investigation and will be discussed during this seminar.
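To make the scalability concern concrete, here is a minimal sketch of a generic, textbook-style MSI directory. It is not the in-network protocol discussed in the seminar; the class names, the three-state encoding, and the message counter are all assumptions introduced for illustration. The point it shows is that every write must invalidate each remote sharer, so coherence traffic grows with the number of cores holding a copy.

```python
from dataclasses import dataclass, field


@dataclass
class DirectoryEntry:
    """Hypothetical directory state for one cache line (textbook MSI)."""
    state: str = "I"                            # "I" invalid, "S" shared, "M" modified
    sharers: set = field(default_factory=set)   # ids of cores holding a copy


class Directory:
    """Toy centralized directory; counts coherence messages it must send."""

    def __init__(self):
        self.entries = {}     # line address -> DirectoryEntry
        self.messages = 0     # invalidations and downgrade requests sent

    def _entry(self, addr):
        return self.entries.setdefault(addr, DirectoryEntry())

    def read(self, core, addr):
        e = self._entry(addr)
        if e.state == "M":
            if e.sharers == {core}:
                return e              # owner re-reads its own line: no traffic
            self.messages += 1        # ask the owner to write back and downgrade
        e.state = "S"
        e.sharers.add(core)
        return e

    def write(self, core, addr):
        e = self._entry(addr)
        # One invalidation per remote copy: cost grows with the sharer count.
        self.messages += len(e.sharers - {core})
        e.state = "M"
        e.sharers = {core}
        return e


d = Directory()
d.read(0, 0x40)
d.read(1, 0x40)     # line is now shared by cores 0 and 1
d.write(2, 0x40)    # two invalidations are sent
d.read(0, 0x40)     # one downgrade request to core 2
print(d.messages, d.entries[0x40].state, sorted(d.entries[0x40].sharers))
```

In this toy model every request goes through a single directory object, which is the kind of indirection an in-network scheme would try to shorten.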


Professor Alain Greiner
UPMC/Lip6 will present the design and validation of a generic, shared-memory, cache-coherent multi-core architecture to support an in-network (IN) cache coherence protocol. This work targets a truly scalable architecture based on a distributed, packet-switched NoC to integrate several tens or hundreds of processor cores.


Most emerging (embedded) platforms are multiprocessor systems-on-chip, in order to meet the performance requirements of modern applications. This is a challenge for software developers and optimizing compilers, as more resources are available for program partitions and applications. Writing threads from sequential program specifications, synchronizing them, and debugging them is a difficult, time-consuming, and error-prone task. Compiler-assisted parallelization should be employed to relieve developers of this task. A key aspect of the TSAR project and its multicore platform is data locality. The architecture uses distributed caches to implement a single address space, which allows efficient data reuse. Good data cache behavior after parallelization is crucial for achieving the desired performance, as non-local access latencies are typically high. Program analysis of cache behavior and parallelization can be implemented efficiently using the polyhedral model, which allows exact analysis of scalar and array references. Iteration spaces of program statements are represented as polyhedra; array index expressions, loop bounds, and conditional expressions are affine combinations of loop iterators and program parameters. We present a case study of a Sobel edge detection kernel and show how to find compile-time solutions for cache behavior analysis using available tools such as Omega, Polylib, and Piplib.
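As a rough illustration of the polyhedral view of a Sobel kernel, the sketch below enumerates the iteration domain and the affine accesses by brute force. It is not the case study's method: tools such as Omega, Polylib, and Piplib reason about these sets symbolically, whereas this code simply lists integer points. The image size, the row-band partition between two threads, and all function names are assumptions made for the example.

```python
# Offsets read by the two 3x3 Sobel masks; the centre coefficient is zero
# in both masks, so img[i][j] itself is not read at iteration (i, j).
SOBEL_OFFSETS = [(di, dj) for di in (-1, 0, 1) for dj in (-1, 0, 1)
                 if (di, dj) != (0, 0)]


def sobel_domain(height, width, row_lo, row_hi):
    """Integer points of the polyhedron
    { (i, j) : 1 <= i <= height-2, 1 <= j <= width-2, row_lo <= i < row_hi }."""
    lo = max(1, row_lo)
    hi = min(height - 1, row_hi)
    return [(i, j) for i in range(lo, hi) for j in range(1, width - 1)]


def footprint(points):
    """Image elements read at the given iterations. Each access
    img[i + di][j + dj] is an affine function of the iterators (i, j)."""
    return {(i + di, j + dj) for (i, j) in points for (di, dj) in SOBEL_OFFSETS}


H, W = 8, 8                                       # hypothetical image size
thread_a = footprint(sobel_domain(H, W, 1, 4))    # rows 1..3
thread_b = footprint(sobel_domain(H, W, 4, 7))    # rows 4..6
shared = thread_a & thread_b                      # elements read by both threads

print(len(thread_a), len(thread_b), len(shared))
```

For this row-band partition the shared set is the two image rows around the band boundary, i.e. 2 * W elements, which matches the closed form one would expect a symbolic analysis to derive at compile time. Those shared elements are the ones whose placement matters most when non-local accesses are expensive.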


A Low Cost Network-on-Chip Adapted to High Performance Multiprocessor Architectures