By Yuriy Sheynin, Elena Suvorova, St. Petersburg University of Aerospace Instrumentation, St. Petersburg, Russia
Abstract
One way to widen the field of application of ASIC-based systems-on-chip is to use an internal interconnection system based on a reconfigurable network-on-chip. In this article we suggest several variants of a reconfigurable system-on-chip structure based on physical and virtual channels and evaluate their parameters. We suggest a mathematical model for evaluating the relative hardware cost of such systems, and we compare the hardware cost and throughput of these variants.
INTRODUCTION
Modern networks-on-chip that include dozens of nodes are typically implemented in ASICs. ASIC manufacturing is oriented toward large production series. Therefore ASIC-based networks-on-chip typically must target a wide class of tasks and applications, and a network-on-chip is typically used to run a variety of applications with different data flow characteristics.
The choice of network topology, the routing algorithms and the mapping of processes onto nodes are important factors that determine system performance and power consumption. Different tasks reach good performance and optimal power consumption on different topologies. However, a fixed specific topology restricts the fields of application of a network-on-chip and reduces its production scale. If a system is intended for a wide class of tasks, it is typically based on a regular topology, but adapting such a topology to specific tasks results in lower performance and higher power consumption. Networks-on-chip with irregular topology are usually well optimized for a narrow class of tasks, for which they give high performance and optimal power consumption. Therefore the development of a configurable network-on-chip with a potentially regular layout that can be configured as an irregular structure matching the set of tasks in the system is a highly relevant problem. The ability to reconfigure the network structure can also be used to increase the defect and fault tolerance of the NoC.
Constraints on interconnection line length are very important when choosing an interconnection graph for a network-on-chip. To address this, EDA tools typically place long interconnection lines in the top metallization layers, where wires can be wide and thick, and insert repeaters into these lines. However, such long wires are often sources of noise, and a large part of the chip area may be spent on them.
Long wires can be avoided at the RTL design stage by including buffers in these lines. This avoids increasing the thickness and width of the lines, and the buffered lines do not induce noise, but the hardware cost of the buffers can be significant. If a system topology includes only short interconnection lines, hundreds of lines can be used to interconnect a pair of nodes. This feature can be used in the development of reconfigurable systems.
Modern technologies allow elements of FPGA technology to be used in ASICs for adaptability. In particular, reprogrammable logic can be used in the interconnections between nodes. In this case the interconnection system can be dynamically configured according to the current tasks in the system. Its parts can also be reconfigured independently, which is important if several tasks with different data flow schemes run in the system.
Communication systems of networks-on-chip developed in many different projects are based on two types of switches: routers (routing switches) and channel switches (elementary switches, switches with reprogrammable logic) [3, 4, 5]. Routing switches use packet routing. Channel switches provide channel switching. Channel switches include buffers, which makes it possible to avoid long interconnection lines [1, 3]. The packet latency of channel switches is lower than that of routing switches. In some cases and applications the ratio of packet latencies for these switch types is very significant [1, 5].
In this article we consider the configuration capabilities of structures based on channel switches. The number of topologies that can be realized in the suggested reconfigurable structure depends on the number of ports of the routing and channel switches, the number of interconnection lines and the interconnection structure between channel switches. The configuration capabilities can be increased by using virtual channels.
BASIC VARIANT OF RECONFIGURABLE STRUCTURE
We suggest a simple basic variant of a reconfigurable interconnection structure that meets strict technology constraints on wire length and on the number of metallization layers used. This reconfigurable system includes terminal nodes, routing switches and channel switches. Every routing switch has one port for connection to a terminal node (t-port) and four ports for connection to neighboring routing switches and channel switches (s-ports). Every channel switch has eight ports (s-ports).
The general interconnection structure between routing switches and channel switches is shown in figure 1a (routing switches are represented by rhombuses, channel switches by octagons). Figure 1b shows a fragment of this structure that also includes the terminal nodes (rectangles).
Figure 1. Suggested interconnection structure
Every terminal node is connected to one routing switch (via the t-port). This connection is not reconfigurable. Routing switches are connected to other routing switches and to channel switches (via s-ports). More than one interconnection line is connected to every s-port of a routing switch and to every s-port of a channel switch. Only one of the lines connected to a port is used at any given time; which line is used is defined at the configuration stage. This choice, together with the configuration of channels inside the channel switches, determines the current network interconnection graph.
The routing switches are directly connected in a 2D grid. The channel switches are used to configure the network for other interconnection graphs. In general, the set of interconnection graphs reachable for a given system depends on the number of channel switches, the number of ports of the routing and channel switches, and the interconnection structure between switches.
Every port of a channel switch includes a buffer. This buffer is used when needed (when the channel is represented by a long net in the current network configuration). The channel mode (with or without buffer) is assigned at the configuration stage for every channel of every switch individually.
The suggested network-on-chip can be dynamically configured, and different parts of the network can be configured independently.
Let us evaluate the hardware cost of this interconnection structure. The length of wires in this system is practically equal to the linear size of a terminal node. The interconnection system needs two metallization layers (if horizontal and vertical line fragments are placed in different layers). The number of routing switches and the number of channel switches are each equal to the number of terminal nodes in the system. Every port of a channel switch has a buffer on its input side. These buffers are used to avoid long wires; for this purpose very small buffers of 1 to 4 data words are sufficient. (The number of bits in a word is determined by the width of the interconnection line.) These buffers can also be used to balance network load; in this case the buffer size depends on the packet size and data flow parameters.
Let us consider some variants of system configuration. Because every routing switch has four s-ports, this system can be configured for graphs in which the degree (valence) of every vertex is at most four.
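The degree constraint above can be checked mechanically before attempting a configuration. The following is a minimal sketch, not part of the original design flow: the function names, the edge-list representation and the example graphs are illustrative assumptions. It tests only the necessary condition stated in the text (no vertex needs more links than a routing switch has s-ports); it does not check whether a placement with short wires exists.

```python
from collections import defaultdict

MAX_S_PORTS = 4  # s-ports per routing switch in the basic variant (from the text)


def max_degree(edges):
    """Return the largest vertex degree of an undirected graph given as edge pairs."""
    degree = defaultdict(int)
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return max(degree.values(), default=0)


def fits_basic_variant(edges, max_ports=MAX_S_PORTS):
    """Necessary (not sufficient) condition: no vertex needs more links than s-ports."""
    return max_degree(edges) <= max_ports


# Hypothetical example graphs.
# A 3x3 2D grid: the center node has degree 4, so the condition holds.
grid_edges = [((r, c), (r, c + 1)) for r in range(3) for c in range(2)]
grid_edges += [((r, c), (r + 1, c)) for r in range(2) for c in range(3)]
# A star with five leaves needs degree 5 at the hub, so the condition fails.
star_edges = [("hub", leaf) for leaf in range(5)]
```

Here `fits_basic_variant(grid_edges)` holds while `fits_basic_variant(star_edges)` does not, which matches the limit of four s-ports per routing switch.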
If the system is to be configured as a 2D grid, the network is based on the direct interconnections between routing switches.
The suggested system can be configured as a torus. In this case the well-known node interleaving method is used for node placement. This method limits the length of interconnections between nodes so that it is independent of the number of nodes in the graph. The interleaving method is illustrated in figure 2: figure 2a shows torus node placement based on a 2D grid, and figure 2b shows interleaved node placement. The system configuration corresponding to interleaved node placement is shown in figure 3; terminal nodes are not shown in this figure.
Figure 2. Variant of torus node placement
Figure 3. Interconnection configuration for torus
In figure 3 the interconnection channels for the torus configuration are marked by black lines.
For interleaved placement, the interconnection channels between terminal nodes include two transit channel switches (with interleaved placement the wire length does not depend on the number of nodes in the system). Depending on the parameters of the implementation technology and the geometric sizes of nodes and switches, the channel buffering mode can be switched on or off in one or both of these channel switches.
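The effect of interleaving on link length can be illustrated for a single ring, that is, one dimension of the torus. The sketch below uses one common form of the method (folding the ring so that nodes from both ends alternate); the exact placement used in the figures may differ, and the function names are illustrative assumptions. It shows that the distance between logical neighbors stays at two positions regardless of ring size, whereas in the natural order the wraparound link spans the whole row.

```python
def interleaved_ring(n):
    """Physical left-to-right order of ring nodes 0..n-1 after folding the ring."""
    order = []
    lo, hi = 0, n - 1
    while lo <= hi:
        order.append(lo)          # next node from the start of the ring
        if lo != hi:
            order.append(hi)      # next node from the end of the ring
        lo += 1
        hi -= 1
    return order


def max_neighbor_distance(order):
    """Largest physical distance (in positions) between logically adjacent ring nodes."""
    n = len(order)
    position = {node: slot for slot, node in enumerate(order)}
    return max(abs(position[i] - position[(i + 1) % n]) for i in range(n))


natural = list(range(8))          # wraparound link 7-0 spans 7 positions
folded = interleaved_ring(8)      # [0, 7, 1, 6, 2, 5, 3, 4]
```

For a 2D torus the same ordering would be applied independently to rows and to columns.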
The suggested system can be configured as a binary tree. We developed special variants of node placement for the binary tree. These variants aim to shorten interconnection channels and to use short channels between routing switches wherever possible. Figure 4 shows a variant of node placement (127 nodes) that can be used for system configuration.
Figure 4. Binary tree configuration (127 nodes)
As this figure shows, most interconnections are direct interconnections between routing switches; they are represented by vertical and horizontal lines. The channels represented by diagonal lines and curves are formed through channel switches.
With 127 nodes, an interconnection between nodes passes through no more than 3 transit channel switches. The number of transit channel switches grows with the number of nodes, but not significantly.
SYSTEMS WITH MORE WIDELY CONFIGURABLE STRUCTURES
We consider modifications of the basic variant that can be configured as a 3D grid or as a torus based on a 3D grid. (These topologies are widely used for many tasks.)
Variant with additional physical channels
In this variant we include additional horizontal interconnections between channel switches and additional diagonal interconnections between channel switches and routing switches. This structure is shown in figure 5.
Figure 5. Variant with additional physical channels
In this figure the interconnections added in comparison with the basic variant are black. The number of additional diagonal interconnections is 2. The number of additional horizontal interconnections depends on the number of nodes in a row of the 3D grid (Nr), see formula (1).
The placement of a 3D grid in this topology is shown in figure 6. The additional horizontal interconnections are used for the inter-slice connections of the interconnection graph.
Figure 6. Variant of placement of 3D-grid nodes
This topology can be configured as a torus based on a 3D grid. To place a 3D torus we use the interleaving method: we interleave the rows in every slice of the 3D grid as for a 2D torus, and then we interleave the slices in the same way as columns are interleaved in a 2D grid. Slice interleaving is shown in figure 7.
Figure 7. Variant of placement of nodes of a 3D-grid based torus
In this figure dotted lines represent interconnections between slices.
Variant with additional virtual channels
Let us consider a variant based on virtual channels that has the same configurability as the variant with additional physical channels. In this variant we include a virtual channel mechanism in the two s-ports of the channel switch that connect it with routing switches and in the left and right s-ports of the channel switch (used for horizontal interconnections between channel switches). The number of virtual channels (Nvch) determines the maximum allowable value of Nr (Nvch = Nr - 1). In this structure the virtual channel mechanism is used in the same directions in which physical channels were added in the previous variant, so the configurability of the two variants is identical.
PARAMETERS EVALUATION
Let us evaluate the relative hardware cost of the systems based on physical channels and on virtual channels as a function of the allowable Nr for the 3D-grid configuration. The area of interconnection lines is currently tens of times smaller than the area of buffers and multiplexers, so we do not include it in our evaluation.
For the variant based on physical channels, the number of ports in a channel switch (Ncp) is
where Nr – the number of nodes in one row of the 3D grid.
For a torus based on a 3D grid:
When ports are added to a switch, its hardware cost grows in proportion to the number of ports. The hardware cost (switch area) of a channel switch can be evaluated by the following formula
- Ncli – the number of lines connected to the i-th port of the switch.
- Zb – the area of one buffer
- Amux – the multiplexer area coefficient
If we create virtual channels in the system, the channel switches must analyze the packet header, and the packet header must include the channel number. If the packet length is byte aligned, one additional byte allows up to 256 virtual channels in the system. Using virtual channels therefore requires additional logic for packet header processing.
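The header overhead described above can be sketched in a few lines. The text states only that one additional byte carries the channel number; placing that byte first in the packet, and the function names used here, are assumptions made for illustration.

```python
MAX_VIRTUAL_CHANNELS = 256  # number of values representable in one header byte


def add_channel_header(channel, payload):
    """Prepend a one-byte virtual channel number (assumed layout) to a packet."""
    if not 0 <= channel < MAX_VIRTUAL_CHANNELS:
        raise ValueError("virtual channel number must fit in one byte")
    return bytes([channel]) + payload


def split_channel_header(packet):
    """Return (channel, payload); this is the analysis a channel switch must perform."""
    if len(packet) < 1:
        raise ValueError("packet is too short to contain a channel byte")
    return packet[0], packet[1:]


packet = add_channel_header(5, b"\x01\x02")
channel, payload = split_channel_header(packet)
```

In hardware the split corresponds to the additional header processing logic in every port that supports virtual channels.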
If we use neither full buffering nor physical channels with additional throughput in the directions where virtual links are created, the throughput of each virtual channel is lower than that of a physical channel in proportion to the number of virtual channels. If this throughput is insufficient for user applications, additional buffers are needed. The size of one buffer should equal the size of the data transmission unit plus additional information such as the channel number.
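As a small worked illustration of this proportionality, the sketch below computes the share of a physical link available to one virtual channel, using the relation Nvch = Nr - 1 given for the virtual channel variant. Equal sharing of the link among the virtual channels is an assumption, as are the function names.

```python
from fractions import Fraction


def virtual_channels_for_row(nr):
    """Nvch = Nr - 1, the relation given for the virtual channel variant."""
    if nr < 2:
        raise ValueError("Nr must be at least 2 to need virtual channels")
    return nr - 1


def per_channel_share(nr):
    """Fraction of physical link throughput per virtual channel (equal sharing assumed)."""
    return Fraction(1, virtual_channels_for_row(nr))


share_small = per_channel_share(3)    # two virtual channels share one link
share_large = per_channel_share(16)   # fifteen virtual channels share one link
```

With Nr = 3 each virtual channel gets half of the link throughput; with Nr = 16 it gets one fifteenth.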
In this case the hardware cost of a channel switch can be evaluated by the following formula
where Nh – the number of virtual channels
H – coefficient of the hardware cost of packet header processing
Zbi – total size of the virtual channel buffers for the i-th physical link.
- Zbvch – size of one virtual channel buffer
- Df – size of data transmission unit (number of words)
- Zb1 – size of one word buffer
We assume that the size of the additional information (the virtual channel number) is one word (or less, but transmissions are word aligned).
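The buffer sizing that follows from these definitions can be sketched as simple arithmetic. The per-channel size of Df + 1 words follows from the statements above (the data transmission unit plus one word of additional information). Taking the total per link as the number of virtual channels times the per-channel size is an assumption made here for illustration, not a formula taken from the source; the function names are also illustrative.

```python
def vc_buffer_words(df):
    """Words in one virtual channel buffer: Df data words plus one word of channel info."""
    if df < 1:
        raise ValueError("data transmission unit must be at least one word")
    return df + 1


def link_buffer_words(df, nh):
    """Total buffer words for one physical link, ASSUMING one buffer per virtual channel."""
    if nh < 1:
        raise ValueError("number of virtual channels must be at least one")
    return nh * vc_buffer_words(df)


# Df = 4 words and 15 virtual channels (Nr = 16 under Nvch = Nr - 1).
words_per_channel = vc_buffer_words(4)
words_per_link = link_buffer_words(4, 15)
```

Multiplying these word counts by Zb1 would give the corresponding buffer area under the same assumption.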
In our system virtual channels are included not in all ports of the channel switch. For ports without virtual channels H = 0 and Zbi = Zb, as in the system with physical channels.
The concrete values of Zb, Amux and H depend on the technology library used. Zb also depends on the buffer size.
For different technology libraries the ratio of the size of a one-word buffer to the size of one-word multiplexing elements ranges from 4:1 to 1:4. Figure 8 shows the dependency of relative hardware cost on Nr (for the case Zb(1 word):Amux = 1:1) for reconfigurable systems based on physical channels and on virtual channels (for virtual channels we consider data transmission unit sizes from 1 to 32).
Figure 8. Relation between row number and hardware cost
These graphs show that if the data transmission unit size is larger than 4, the relative hardware cost of the system based on virtual channels is significantly higher than that of the system based on physical channels.
If user applications do not require high throughput between 3D-grid slices, the relative hardware cost of the system based on virtual channels is 60, the same as for the physical channel variant with Nr = 3.
The relative hardware cost of the basic variant is 28. With the additional configurability, for Nr = 16 the relative hardware cost grows by a factor of 20 for the system based on physical channels and by a factor of 2 to 70 for the system based on virtual channels.
CONCLUSION
In this paper we suggest reconfigurable structures that can be configured as 2D grids, tori based on 2D grids, binary trees, 3D grids, tori based on 3D grids and some other interconnection graphs with vertex degree (valence) of at most 6. We suggest a mathematical model for evaluating the relative hardware cost of these reconfigurable systems.
We compare the hardware cost of the reconfigurable structure variants based on physical channels and on virtual channels. The hardware cost of the system based on virtual channels is lower than that of the system based on physical channels if user applications require relatively low throughput between nodes lying in different slices of the 3D grid (Nr - 1 times lower than the throughput between nodes in one slice) or if the data transmission unit is very small (up to 4 data words). In other cases the hardware cost of the configurable system based on physical channels is lower than that of the system based on virtual channels.
REFERENCES
[1] U. Y. Ogras, J. Hu, R. Marculescu, “Key Research Problems in NoC Design: A Holistic Perspective”, Proc. CODES+ISSS, Jersey City, NJ, 2005.
[2] M. Modarressi, H. Sarbazi-Azad, “Power-Aware Mapping for Reconfigurable NoC Architectures”, Sharif University of Technology and IPM School of Computer Science, Tehran, Iran, in ICCD, 2007, 6 pp.
[3] U. Y. Ogras, R. Marculescu, “Application-Specific Network-on-Chip Architecture Customization via Long-Range Link Insertion”, in IEEE/ACM Intl. Conf. on Computer Aided Design, San Jose, 2005.
[4] J. Duato, S. Yalamanchili, L. Ni, Interconnection Networks: An Engineering Approach, 2nd edition, Morgan Kaufmann Publishers, 2002.
[5] Srinivasan et al., “A Technique for Low Energy Mapping and Routing in Network-on-Chip Architectures”, in ISLPED, 2005.
[6] Y. E. Sheynin, E. A. Suvorova, “Placement of different type nodes in a Network-on-chip graph”, IP-SOC 2007 (IP Based SoC Design Conference & Exhibition), Dec. 5-6, 2007, France.
[7] Y. Sheynin, E. Suvorova, “Methods of selection of structural and architectural organization of multicast switches”, IP-SOC 2006 (IP Based SoC Design Conference & Exhibition), Dec. 7-8, 2006, France.