By Michael Gilroy, Institute for System Level Integration, Livingston, UK; James Irvine, University of Strathclyde, Glasgow, UK; Gideon Riddell, A2E Limited, Livingston, UK
As storage requirements and magnetic disk densities increase, the need for reliable storage solutions also increases. This IP core, written in Verilog HDL, provides a small and efficient hardware accelerator for performing RAID 6 calculations to provide uninterrupted access to data during both single and double disk failures.
In this paper we describe the implementation and verification of a RAID 6 IP block. We present an example system implemented on an FPGA to demonstrate the capabilities of the IP block and verify its operation in hardware.
The digital storage requirements for both consumer and high end systems continue to increase rapidly year on year. This is being driven by the widespread adoption of digital TV, photography, music, etc., plus increased legislation requiring businesses to retain data over longer time periods.
Coupled with this increase in storage requirements is the need to ensure the availability and reliability of the data, and to deliver this at as low a cost as possible. Redundant arrays of independent disks (RAID) have, since the 1980s, provided the means to store and recover lost data efficiently. Of the original RAID levels, RAID 5 provided the optimal solution based upon cost and reliability for systems with 3 or more hard disk drives.
The performance benefits offered by RAID 5 based solutions are slowly being eroded by the increased risk of data loss due to multiple simultaneous disk errors, be they unrecoverable read errors or disk drive failures (erasures).
In this paper we discuss the limitations of RAID 5 and the benefits of adopting RAID 6, and present a RAID 6 IP block, its verification process, and an example RAID 6 based application.
RAID allows the combination of multiple hard disk drives to provide a combination of one or more of the following characteristics:
- Protection against data loss.
- Provision of real-time data recovery with uninterrupted data access during both drive failure and data recovery.
- Increased system uptime and network availability.
- Multiple drives working in parallel which increases system performance.
Various RAID levels were proposed by Patterson et al. in their 1988 paper describing RAID architectures. Of these, RAID 5 offered the best performance in terms of reliability when compared with the overall system cost, data availability, redundancy overhead and data throughput.
RAID 5 utilises a single redundant disk which contains parity data to allow data recovery from any single disk failure or read error. The parity calculation may be readily implemented in software or via a simple field of XOR gates in hardware.
Fig. 1. RAID 5 implementation. Data and associated parity are striped across the drives. This is the smallest RAID 5 array type with the data and parity locations being rotated on each stripe
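As an illustration of the XOR parity described above, the following is a minimal Python sketch, not the paper's implementation. The helper name `xor_blocks` and the sample byte values are assumptions chosen for the example; it models the smallest RAID 5 stripe of two data blocks and one parity block.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks (hypothetical helper)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe of the smallest RAID 5 array: two data blocks plus parity.
d0 = bytes([0x12, 0x34, 0x56, 0x78])
d1 = bytes([0x9A, 0xBC, 0xDE, 0xF0])
parity = xor_blocks([d0, d1])

# The disk holding d1 fails: XOR the surviving block with the parity.
rebuilt_d1 = xor_blocks([d0, parity])
```

Because XOR is its own inverse, the same routine serves for both parity generation and recovery, which is why a simple field of XOR gates suffices in hardware.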
The need for greater redundancy in array based systems was originally limited by the purchase cost and the low probability of simultaneous multiple disk failures. However, as disk capacities increase and disk arrays increase in density, the likelihood of an unrecoverable read error or multiple simultaneous disk failures occurring increases.
The mean time to failure (MTTF) of disk drives has improved rapidly, with less than 1% of disk drives failing per year. However, the probability of an unrecoverable read error occurring whilst restoring a disk array has increased due to the storage capacities now available from hard disk drives.
Fig. 2. Probability of a read error occurring while an array rebuilds a single lost disk. Disks have a bit error rate for read accesses of 1 in 10^14.
Disk arrays that use large numbers of disks, or a small number of large disk drives, have an increased probability of errors occurring in a RAID 5 array. The use of double disk redundancy, RAID 6, reduces this risk.
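The relationship behind this figure can be sketched as follows. This is a simplified model under stated assumptions: bit errors are independent, a rebuild reads every bit of every surviving disk, and the bit error rate is 1 in 10^14. The function name and the example disk sizes are hypothetical and are not taken from the paper.

```python
import math

def rebuild_error_probability(surviving_disks, capacity_bytes, bit_error_rate=1e-14):
    """Probability of at least one unrecoverable read error during a rebuild.

    Assumes independent bit errors and a full read of every surviving disk.
    Computes 1 - (1 - ber)^bits using log1p/expm1 to avoid rounding loss.
    """
    bits_read = surviving_disks * capacity_bytes * 8
    return -math.expm1(bits_read * math.log1p(-bit_error_rate))

# Hypothetical example: five-disk RAID 5 array of 500 GB drives, one disk lost.
p = rebuild_error_probability(surviving_disks=4, capacity_bytes=500 * 10**9)
print(f"{p:.1%}")
```

Under these assumptions the example array has roughly a 15% chance of hitting a read error during the rebuild, and the probability grows with either the number of disks or their capacity.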
RAID 6, whilst not defined in the original RAID definitions, has been loosely defined as offering support for the recovery of any 2 disk erasures. A number of algorithms have been proposed for implementing RAID 6, including EVENODD encoding and Reed-Solomon (RS) coding.
Fig. 3. RAID 6 RS based design. Shows the smallest array size with two data disks and two checksum disks.
Our solution utilises a Reed-Solomon based encoding scheme over a Galois field of 16 elements, GF(16), which necessitates the use of two redundant disk drives. The use of Reed-Solomon codes instead of one of the other algorithms was determined by the ease of implementation and the similarities between this algorithm and RAID 5. RAID 5 may be considered a special case of Reed-Solomon with a redundancy of 1, and therefore the RAID 6 implementation may make use of data striping and other RAID 5 type optimisations for updating stripes.
THE RAID 6 IP BLOCK
Our RAID 6 IP block is based upon a pipelined Reed-Solomon encoder and decoder with a controlling state machine and local memory.
Reed-Solomon coding is performed over a finite, or Galois, field. Galois field arithmetic is well suited to hardware implementation as the results of all multiplications and divisions are guaranteed to be elements of the same field, and so always fit in a fixed number of bits. The use of Galois field arithmetic makes the algorithm time consuming to implement in software, as a number of operations must be performed per input data block.
All Galois field addition and subtraction is performed by an XOR operation. Multiplication and division may be performed utilising logarithms and anti-logarithms: the logarithms of the operands are added or subtracted, and the anti-logarithm of the result is taken. These are readily implemented as lookup tables in hardware.
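The table-based arithmetic described above can be sketched in Python as follows. This is an illustrative model only: the paper does not name its field polynomial, so the primitive polynomial x^4 + x + 1 and all function names here are assumptions.

```python
PRIMITIVE_POLY = 0b10011  # x^4 + x + 1 (assumed; not stated in the paper)

def build_tables():
    """Build logarithm and anti-logarithm tables for GF(16)."""
    antilog = [0] * 15      # antilog[e] = alpha^e
    log = [None] * 16       # log[0] is undefined
    value = 1
    for exponent in range(15):
        antilog[exponent] = value
        log[value] = exponent
        value <<= 1                 # multiply by alpha
        if value & 0x10:            # reduce when degree reaches 4
            value ^= PRIMITIVE_POLY
    return log, antilog

LOG, ANTILOG = build_tables()

def gf_add(a, b):
    """Addition and subtraction are both XOR."""
    return a ^ b

def gf_mul(a, b):
    """Multiply by adding logarithms modulo 15."""
    if a == 0 or b == 0:
        return 0
    return ANTILOG[(LOG[a] + LOG[b]) % 15]

def gf_div(a, b):
    """Divide by subtracting logarithms modulo 15."""
    if b == 0:
        raise ZeroDivisionError("division by zero in GF(16)")
    if a == 0:
        return 0
    return ANTILOG[(LOG[a] - LOG[b]) % 15]
```

Every result is a 4-bit value, so each table needs at most 16 entries of 4 bits, which is what makes the lookup approach inexpensive in hardware.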
The RAID 6 IP block does not perform error detection on the incoming data stream. Data errors and disk failures are instead indicated by the CRC checks performed by the hard disk drives on all read blocks.
Fig. 4. RAID 6 IP block interconnection. The interconnect bus may be a direct connection to an on-chip device, or an Avalon or other bus provided a suitable wrapper is included.
Data is read into the RAID 6 IP block from one disk at a time until the checksum memory is filled, or all the data has been read. The checksum memory stores the temporary results of the Reed-Solomon calculations. These are fed back as each new disk is read. Once the final disk has been read the calculated checksum(s) may be output to the appropriate location.
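The accumulation described above can be modelled with a short Python sketch. It assumes a common P/Q construction in which the first checksum is plain parity and the second weights disk i by g^i with g = 2 over GF(16) reduced by x^4 + x + 1; the paper does not give its exact coefficients, so these choices, along with the class and method names, are assumptions.

```python
def gf16_mul(a, b):
    """Shift-and-add multiply in GF(16), reduced by x^4 + x + 1 (assumed)."""
    product = 0
    for _ in range(4):
        if b & 1:
            product ^= a
        b >>= 1
        a <<= 1
        if a & 0x10:
            a ^= 0x13
    return product

class ChecksumMemory:
    """Holds the partial P and Q checksums while disks are read one at a time."""

    def __init__(self, stripe_len):
        self.p = [0] * stripe_len
        self.q = [0] * stripe_len
        self.coefficient = 1          # g^0 for the first disk

    def absorb_disk(self, nibbles):
        """Feed back the stored partial results with the next disk's data."""
        for i, d in enumerate(nibbles):
            self.p[i] ^= d
            self.q[i] ^= gf16_mul(self.coefficient, d)
        self.coefficient = gf16_mul(self.coefficient, 2)  # advance to g^(i+1)

# Three data disks, three 4-bit symbols each (arbitrary sample values).
disks = [[0x1, 0xA, 0xF], [0x7, 0x0, 0x3], [0xC, 0x5, 0x9]]
memory = ChecksumMemory(stripe_len=3)
for disk in disks:
    memory.absorb_disk(disk)
```

After the last disk has been absorbed, `memory.p` and `memory.q` hold the two checksums ready to be written out. Only the partial checksums need to be stored, not the data already read.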
The Reed-Solomon coding scheme which has been utilised supports up to 16 disks and requires that data be encoded or decoded 4-bits at a time. By implementing multiple encoder/decoder blocks in parallel various data bus widths may be supported. Our test system verified the operation of both 32-bit and 64-bit wide data paths.
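To illustrate how 4-bit operations scale to wider buses, the sketch below splits a data word into nibbles, applies the same GF(16) multiply to each lane, and reassembles the result. In hardware the lanes would be parallel encoder/decoder instances; here they are a simple loop. The polynomial x^4 + x + 1 and all names are assumptions for the example.

```python
def gf16_mul(a, b):
    """Shift-and-add multiply in GF(16), reduced by x^4 + x + 1 (assumed)."""
    product = 0
    for _ in range(4):
        if b & 1:
            product ^= a
        b >>= 1
        a <<= 1
        if a & 0x10:
            a ^= 0x13
    return product

def split_nibbles(word, width_bits):
    """Split a bus word into 4-bit lanes, least significant lane first."""
    return [(word >> shift) & 0xF for shift in range(0, width_bits, 4)]

def join_nibbles(nibbles):
    """Reassemble 4-bit lanes into a single bus word."""
    word = 0
    for lane, nibble in enumerate(nibbles):
        word |= nibble << (4 * lane)
    return word

def scale_word(coefficient, word, width_bits=32):
    """Multiply every 4-bit lane of a bus word by the same coefficient."""
    lanes = split_nibbles(word, width_bits)
    return join_nibbles([gf16_mul(coefficient, n) for n in lanes])

scaled_32 = scale_word(2, 0x88888888, width_bits=32)
scaled_64 = scale_word(2, 0x8888888888888888, width_bits=64)
```

A 32-bit path corresponds to eight lanes and a 64-bit path to sixteen. The lanes share no state, which is why widening the bus only requires replicating the 4-bit block.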
The IP block has been implemented in two modes. The first provided direct connection to a PCI bus IP block. This allowed the design to be verified as a RAID 6 accelerator. The second provided an Avalon bus wrapper allowing the design to be readily implemented in and added to systems based upon the Avalon bus specification.
Verification of the IP block was performed by both simulation and implementation on an FPGA, with testing using standard software tools. Simulation of the RAID 6 RTL in a Verilog simulator allowed verification of correct operation of the encoder for all valid disk array sizes from 4 to 16 disks. Simulation also showed proper data recovery for all combinations of single and double disk failures.
Synthesis of the design on to a number of Altera FPGAs showed that the RAID 6 IP block was capable of operating at speeds of up to 300 MHz using 32-bit and 64-bit wide data paths. Ensuring data availability was found to be the limiting factor for the hardware design.
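A software model of this kind of exhaustive check might look like the following sketch. It is not the authors' testbench: it assumes a P/Q code over GF(16) with polynomial x^4 + x + 1 and generator 2, covers only the case where two data disks are lost, and uses hypothetical function names throughout.

```python
from itertools import combinations

# Anti-logarithm and logarithm tables for GF(16), polynomial x^4 + x + 1 (assumed).
ANTILOG = []
value = 1
for _ in range(15):
    ANTILOG.append(value)
    value <<= 1
    if value & 0x10:
        value ^= 0x13
LOG = {v: e for e, v in enumerate(ANTILOG)}

def mul(a, b):
    return 0 if 0 in (a, b) else ANTILOG[(LOG[a] + LOG[b]) % 15]

def div(a, b):
    # b is never zero in this sketch: it is g^x XOR g^y with x != y.
    return 0 if a == 0 else ANTILOG[(LOG[a] - LOG[b]) % 15]

def encode(data):
    """Return the (P, Q) checksums for one 4-bit symbol per data disk."""
    p = q = 0
    for i, d in enumerate(data):
        p ^= d
        q ^= mul(ANTILOG[i], d)
    return p, q

def recover_two(data, p, q, x, y):
    """Rebuild data[x] and data[y], treated as lost, from the survivors."""
    p_partial, q_partial = p, q
    for i, d in enumerate(data):
        if i not in (x, y):
            p_partial ^= d                      # leaves Dx ^ Dy
            q_partial ^= mul(ANTILOG[i], d)     # leaves g^x*Dx ^ g^y*Dy
    gx, gy = ANTILOG[x], ANTILOG[y]
    dx = div(q_partial ^ mul(gy, p_partial), gx ^ gy)
    return dx, p_partial ^ dx

def check_all_double_failures(data):
    """True if every pair of lost data disks is recovered correctly."""
    p, q = encode(data)
    return all(recover_two(data, p, q, x, y) == (data[x], data[y])
               for x, y in combinations(range(len(data)), 2))

# 14 data disks plus 2 checksum disks gives the largest 16-disk array.
largest_ok = check_all_double_failures([3, 0, 15, 7, 9, 1, 12, 4, 8, 2, 6, 11, 5, 10])
smallest_ok = check_all_double_failures([5, 9])
```

Failures involving one or both checksum disks are simpler, since the lost checksum can be recomputed from the data once any lost data disk has been restored by the surviving checksum.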
Verification of the checksum generation and data regeneration was performed with a simple test system connecting the RAID 6 IP block to DDR SDRAM and a NIOS II processor via the Avalon bus. Using this embedded system we were able to generate test data in software with which to verify the correct functionality of the hardware. Generation of small RAID 6 arrays and regeneration of failed disk drives were shown to function correctly.
Fig. 5. Simple embedded system to allow verification of the RAID 6 IP block. It also provides the basis for a complete hardware RAID 6 controller.
To verify that the hardware implementation provided a speed-up over software based RAID 6 algorithms, the RAID 6 IP block was synthesised onto a PCI based FPGA development platform. This configuration allowed the RAID 6 hardware accelerator to be accessed directly over the PCI bus, and so allowed direct comparison of the accelerated hardware RAID 6 with software based RAID algorithms.
Our test platform consisted of: a server class computer running Linux kernel 126.96.36.199 on an AMD Opteron processor with 1 GB of RAM; 1 parallel ATA drive for the operating system; and 4 Serial ATA drives forming the RAID 6 array. Testing of the RAID 6 IP block was carried out on an Altera Stratix EP1S25 based PCI development card connected to the test platform. The RAID 6 IP block was configured to perform its data accesses via the PCI bus, accessing data from main memory.
The standard Linux RAID 6 software driver was modified to support the use of the hardware accelerator on the PCI bus. All software based calculations of the Reed-Solomon coding were replaced with calls to the hardware.
Fig. 6. PCI RAID 6 accelerator based upon the RAID 6 IP block and implemented on an Altera Stratix PCI development board
RESULTS & ANALYSIS
Our hardware test platform allowed the hardware system to be tested and compared with existing software only solutions. Standard Linux based benchmarking tools were utilised to determine the data throughput and CPU utilisation. BONNIE++, a common benchmarking tool, showed that the hardware performed favourably when compared to the software only implementation, and it was noted that there was a significant drop in CPU utilisation, between 33% and 50%, when using the hardware.
The overall performance of the hardware RAID 6 accelerator was limited by both the PCI bus throughput and the lack of a direct connection to the hard disk drives. Even with these limitations, the data throughput of the hardware accelerator for read and write accesses matched that of the software RAID 6 algorithm, and improved relative to it as drives were failed and recovered.
RAID based storage solutions provide the means to build low cost, reliable data storage systems. In this paper we have demonstrated an implementation of a RAID 6 IP block to provide superior data reliability, implemented and tested the IP in an FPGA, and shown that this hardware solution provides performance benefits over software only implementations even at low data throughput rates.
Additionally, the IP core has been shown to be readily implemented in Avalon bus based systems, allowing rapid development of RAID 6 accelerated systems on Altera FPGAs.
Our thanks go to both A2E Limited and EPSRC for providing funding for this research work.
 Bray, T. BONNIE user manual
 N. Brown
 M. Blaum et al. EVENODD: An Efficient Scheme for Tolerating Double Disk Failures in RAID Architectures, IEEE Transactions on Computers, Vol. 44, No. 2, pp. 192-202, February 1995.
 IO Zone
 M.H. Jing et al. A fast error and erasure correction algorithm for a simple RS-RAID, 2001, pp333-338
 Karp, M: All about RAID, Network World Newsletter, 7th July 2005
 Paul Luse, Mark Schmisseur, Understanding Intelligent RAID 6, Technology@Intel magazine, May 2005
 Maxtor Corporation, Atlas 15K II SAS datasheet, 2005 Maxtor Corporation.
 Maxtor Corporation, DiamondMax 10 datasheet, 2005 Maxtor Corporation.
 D.A. Patterson, G.A. Gibson, R.H. Katz, A Case for Redundant Arrays of Inexpensive Disks (RAID), Proceedings, ACM SIGMOD Conference, pp. 109-116, June 1988.
 I. S. Reed and G. Solomon. Polynomial codes over certain finite fields. J. SIAM, 8(2):300-304, June 1960
 Wicker, S., Bhargava, V., Reed-Solomon Codes and Their Applications, IEEE Press, 1994