SAN JOSE -- Countering claims made recently by an industry microprocessor research firm, Intel Corp. at this week's Intel Developer Forum here said the upcoming Pentium 4 has no deep pipeline performance penalty.
Intel executives here at IDF detailed the Pentium 4's NetBurst technology, which they said significantly increases performance over other processors, while nearly doubling the number of processor pipeline stages.
Jeff Austin, Intel's IA-32 architect launch manager, said the Pentium 4's 20-stage pipeline suffers no penalty for pre-fetch misprediction because of its use of the NetBurst technology. Misprediction, which sounds like an arcane technical question, is a key performance factor. To increase the speed of operations and data rates, modern processors literally try to guess in advance what data will be needed. If the processor guesses wrong, a deep 20-stage pipeline such as the Pentium 4's can take up to 13 clock cycles to purge all the data and refill, slowing operations.
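The cost of a wrong guess can be sketched with a simple back-of-the-envelope model. The Python below is only an illustration, not Intel's or InQuest's methodology: the function names, the baseline cycles per instruction, the share of instructions that depend on a prediction, and the miss rate are all hypothetical inputs. Only the 13-cycle worst-case refill figure comes from the numbers quoted above.

```python
def effective_cpi(base_cpi, predictions_per_instr, miss_rate, flush_penalty_cycles):
    """Average cycles per instruction once pipeline flushes are counted.

    Each misprediction is assumed to cost the full flush penalty,
    which is a pessimistic simplification.
    """
    stall_cycles = predictions_per_instr * miss_rate * flush_penalty_cycles
    return base_cpi + stall_cycles


def relative_throughput(clock_ghz, cpi):
    """Billions of instructions per second for a given clock and CPI."""
    return clock_ghz / cpi


# Hypothetical workload: 1 in 5 instructions relies on a prediction,
# and 10 percent of those predictions are wrong.
cpi = effective_cpi(base_cpi=1.0, predictions_per_instr=0.2,
                    miss_rate=0.10, flush_penalty_cycles=13)
print(f"effective CPI: {cpi:.2f}")
print(f"throughput at 1.4 GHz: {relative_throughput(1.4, cpi):.2f} G instr/s")
```

In this toy model the penalty scales with the miss rate, which is why better prediction matters more as the pipeline gets deeper.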
Bert McComas, an analyst at InQuest Research Inc. in Gilbert, Ariz., claimed recently that the pre-fetch misprediction problem causes the 1.4-GHz Pentium 4 to operate at the same performance level as the 1.13-GHz Pentium III.
Intel's Austin, however, said NetBurst corrects most of the misprediction problem, with the Pentium 4 performing at the highest level of any Intel processor to date. Allowing the deep Pentium 4 pipeline to meet performance targets is only one of NetBurst's goals, as the device also aims to provide much faster integer and floating-point instruction operations.
NetBurst includes Advanced Dynamic Execution, a speculative engine that greatly increases memory pre-fetch prediction rates, according to Intel. The technique keeps three times as many instructions in pre-fetch as the Pentium III and includes more sophisticated algorithms that look at many prior executions before making a prediction on data to be accessed, Austin said.
The Pentium 4 also features a Level 1 on-chip cache that holds already decoded instructions, thus eliminating decode latency. The L1 cache of the Pentium III, in comparison, must decode instructions each time they are issued, reducing the rate at which data is fed to the processor.
NetBurst's Rapid Execution Engine, another feature, includes an integer arithmetic logic unit (ALU) running at 2.8 GHz, twice the main processor clock speed, which provides extremely rapid processing of integer instructions, according to Austin.
The new Streaming SIMD Extensions 2 (SSE2) in NetBurst also speed processing by performing integer arithmetic on 128 bits every clock cycle, twice as much as the Pentium III. Additionally, Intel said, NetBurst adds a 128-bit double-precision floating-point operation not found in the Pentium III.
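To see what 128 bits per operation means in practice, the sketch below counts how many values of a given width fit in one 128-bit register and emulates a packed integer add in plain Python. It is an illustration only: the helper names `lanes` and `packed_add` are hypothetical, the wraparound behavior is an assumption made for the example, and real SIMD code would use processor instructions rather than a Python list.

```python
REGISTER_BITS = 128  # width quoted for the new extension


def lanes(element_bits, register_bits=REGISTER_BITS):
    """Number of elements of the given width that fit in one register."""
    if register_bits % element_bits != 0:
        raise ValueError("element width must divide the register width")
    return register_bits // element_bits


def packed_add(a, b, element_bits):
    """Add two registers lane by lane, wrapping each lane on overflow."""
    n = lanes(element_bits)
    if len(a) != n or len(b) != n:
        raise ValueError(f"expected {n} elements per operand")
    mask = (1 << element_bits) - 1
    return [(x + y) & mask for x, y in zip(a, b)]


print(lanes(64))  # two 64-bit values, e.g. double-precision numbers
print(lanes(32))  # four 32-bit values
print(packed_add([1, 2, 3, 4], [10, 20, 30, 40], element_bits=32))
```

A 64-bit register would hold half as many elements of each width, which matches the twofold difference described above.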