Comparison of CPU microarchitectures
The following is a comparison of CPU microarchitectures.
| Microarchitecture | Year | Pipeline stages | Notable features |
|---|---|---|---|
| Elbrus-8S | 2014 | | VLIW, Elbrus (proprietary, closed) version 5, 64-bit |
| AMD K5 | 1996 | 5 | Superscalar, branch prediction, speculative execution, out-of-order execution, register renaming[a] | 
| AMD K6 | 1997 | 6 | Superscalar, branch prediction, speculative execution, out-of-order execution, register renaming[b] | 
| AMD K6-III | 1999 | | Branch prediction, speculative execution, out-of-order execution[1] |
| AMD K7 | 1999 | | Out-of-order execution, branch prediction, Harvard architecture |
| AMD K8 | 2003 | | 64-bit, integrated memory controller, 16-byte instruction prefetching |
| AMD K10 | 2007 | | Superscalar, out-of-order execution, 32-way set associative L3 victim cache, 32-byte instruction prefetching |
| ARM7TDMI (-S) | 2001 | 3 | |
| ARM7EJ-S | 2001 | 5 | |
| ARM810 | | 5 | Static branch prediction, double-bandwidth memory |
| ARM9TDMI | 1998 | 5 | |
| ARM1020E | | 6 | |
| XScale PXA210/PXA250 | 2002 | 7 | |
| ARM1136J(F)-S | | 8 | |
| ARM1156T2(F)-S | | 9 | |
| ARM Cortex-A5 | | 8 | Multi-core, single issue, in-order |
| ARM Cortex-A7 MPCore | | 8 | Partial dual-issue, in-order, 2-way set associative level 1 instruction cache |
| ARM Cortex-A8 | 2005 | 13 | Dual-issue, in-order, speculative execution, superscalar, 2-way pipeline decode | 
| ARM Cortex-A9 MPCore | 2007 | 8–11 | Out-of-order, speculative issue, superscalar | 
| ARM Cortex-A15 MPCore | 2010 | 15 | Multi-core (up to 16), out-of-order, speculative issue, 3-way superscalar | 
| ARM Cortex-A53 | 2012 | | Partial dual-issue, in-order |
| ARM Cortex-A55 | 2017 | 8 | In-order, speculative execution |
| ARM Cortex-A57 | 2012 | | Deeply out-of-order, wide multi-issue, 3-way superscalar |
| ARM Cortex-A72 | 2015 | | |
| ARM Cortex-A73 | 2016 | | Out-of-order superscalar |
| ARM Cortex-A75 | 2017 | 11–13 | Out-of-order superscalar, speculative execution, register renaming, 3-way | 
| ARM Cortex-A76 | 2018 | 13 | Out-of-order superscalar, 4-way pipeline decode | 
| ARM Cortex-A77 | 2019 | 13 | Out-of-order superscalar, speculative execution, register renaming, 6-way pipeline decode, 10-issue, branch prediction, L3 cache | 
| ARM Cortex-A78 | 2020 | 14 | Out-of-order superscalar, register renaming, 4-way pipeline decode, 6 instructions per cycle, branch prediction, L3 cache |
| ARM Cortex-A710 | 2021 | 10 | |
| ARM Cortex-X1 | 2020 | 13 | 5-wide decode out-of-order superscalar, L3 cache | 
| ARM Cortex-X2 | 2021 | 10 | |
| ARM Cortex-X3 | 2022 | 9 | |
| ARM Cortex-X4 | 2023 | 10 | |
| AVR32 AP7 | | 7 | |
| AVR32 UC3 | | 3 | Harvard architecture |
| Bobcat | 2011 | | Out-of-order execution |
| Bulldozer | 2011 | 20 | Shared multithreaded L2 cache, multithreading, multi-core, around 20 stage long pipeline, integrated memory controller, out-of-order, superscalar, up to 16 cores per chip, up to 16 MB L3 cache, Virtualization, Turbo Core, FlexFPU which uses simultaneous multithreading[2] | 
| Piledriver | 2012 | | Shared multithreaded L2 cache, multithreading, multi-core, around 20-stage pipeline, integrated memory controller, out-of-order, superscalar, up to 16 MB L2 cache, up to 16 MB L3 cache, virtualization, FlexFPU which uses simultaneous multithreading,[2] up to 16 cores per chip, up to 5 GHz clock speed, up to 220 W TDP, Turbo Core |
| Steamroller | 2014 | | Multi-core, branch prediction |
| Excavator | 2015 | 20 | Multi-core | 
| Zen | 2017 | 19 | Multi-core, superscalar, 2-way simultaneous multithreading, 4-way decode, out-of-order execution, L3 cache | 
| Zen+ | 2018 | 19 | Multi-core, superscalar, 4-way decode, out-of-order execution, L3 cache | 
| Zen 2 | 2019 | 19 | Multi-chip module, multi-core, superscalar, 4-way decode, out-of-order execution, L3 cache | 
| Zen 3 | 2020 | 19 | Multi-chip module, multi-core, superscalar, 4-way decode, out-of-order execution, SMT, L3 cache | 
| Zen 4 | 2022 | | Multi-chip module, multi-core, superscalar, L3 cache |
| Crusoe | 2000 | | In-order execution, 128-bit VLIW, integrated memory controller |
| Efficeon | 2004 | | In-order execution, 256-bit VLIW, fully integrated memory controller |
| Cyrix Cx5x86 | 1995 | 6[3] | Branch prediction | 
| Cyrix 6x86 | 1996 | | Superscalar, superpipelined, register renaming, speculative execution, out-of-order execution |
| DLX | | 5 | |
| eSi-3200 | | 5 | In-order, speculative issue |
| eSi-3250 | | 5 | In-order, speculative issue |
| EV4 (Alpha 21064) | | | Superscalar |
| EV7 (Alpha 21364) | | | Superscalar design with out-of-order execution, branch prediction, 4-way simultaneous multithreading, integrated memory controller |
| EV8 (Alpha 21464) | | | Superscalar design with out-of-order execution |
| 65k | | | Ultra low power consumption, register renaming, out-of-order execution, branch prediction, multi-core, module, capable of reaching higher clock speeds |
| P5 (Pentium) | 1993 | 5 | Superscalar | 
| P6 (Pentium Pro) | | 14 | Speculative execution, register renaming, superscalar design with out-of-order execution |
| P6 (Pentium II) | | 14[4] | Branch prediction |
| P6 (Pentium III) | 1995 | 14[4] | |
| Intel Itanium "Merced" | 2001 | | Single core, L3 cache |
| Intel Itanium 2 "McKinley" | 2002 | 11[5] | Speculative execution, branch prediction, register renaming, 30 execution units, multithreading, multi-core, coarse-grained multithreading, 2-way simultaneous multithreading, Dual-domain multithreading, Turbo Boost, Virtualization, VLIW, RAS with Advanced Machine Check Architecture, Instruction Replay technology, Cache Safe technology, Enhanced SpeedStep technology | 
| Intel NetBurst (Willamette) | 2000 | 20 | 2-way simultaneous multithreading (Hyper-threading), Rapid Execution Engine, Execution Trace Cache, quad-pumped Front-Side Bus, Hyper-pipelined Technology, superscalar, out-of-order |
| NetBurst (Northwood) | 2002 | 20 | 2-way simultaneous multithreading | 
| NetBurst (Prescott) | 2004 | 31 | 2-way simultaneous multithreading | 
| NetBurst (Cedar Mill) | 2006 | 31 | 2-way simultaneous multithreading | 
| Intel Core | 2006 | 12 | Multi-core, out-of-order, 4-way superscalar | 
| Intel Atom | | 16 | 2-way simultaneous multithreading, in-order, no instruction reordering, speculative execution, or register renaming |
| Intel Atom Oak Trail | | | 2-way simultaneous multithreading, in-order, burst mode, 512 KB L2 cache |
| Intel Atom Bonnell | 2008 | | SMT |
| Intel Atom Silvermont | 2013 | | Out-of-order execution |
| Intel Atom Goldmont | 2016 | | Multi-core, out-of-order execution, 3-wide superscalar pipeline, L2 cache |
| Intel Atom Goldmont Plus | 2017 | | Multi-core |
| Intel Atom Tremont | 2019 | | Multi-core, superscalar, out-of-order execution, speculative execution, register renaming |
| Intel Atom Gracemont | 2021 | | Multi-core, superscalar, out-of-order execution, speculative execution, register renaming |
| Intel Atom Crestmont | 2023 | | Multi-core |
| Intel Atom Skymont | 2024 | | Multi-core |
| Nehalem | 2008 | 14 | 2-way simultaneous multithreading, out-of-order, 6-way superscalar, integrated memory controller, L1/L2/L3 cache, Turbo Boost | 
| Sandy Bridge | 2011 | 14 | 2-way simultaneous multithreading, multi-core, on-die graphics and PCIe controller, system agent with integrated memory and display controller, ring interconnect, L1/L2/L3 cache, micro-op cache, 2 threads per core, Turbo Boost |
| Intel Haswell | 2013 | 14–19 | SoC design, multi-core, multithreading, 2-way simultaneous multithreading, hardware-based transactional memory (in selected models), L4 cache (in GT3 models), Turbo Boost, out-of-order execution, superscalar, up to 8 MB L3 cache (mainstream), up to 20 MB L3 cache (Extreme) | 
| Broadwell | 2014 | 14–19 | Multi-core, multithreading | 
| Skylake | 2015 | 14–19 | Multi-core, L4 cache on certain Skylake-R, Skylake-U and Skylake-Y models, on-package PCH on U, Y, m3, m5 and m7 models, 5-wide superscalar/5-issue |
| Kaby Lake | 2016 | 14–19 | Multi-core, L4 cache on certain low and ultra low power models (Kaby Lake-U and Kaby Lake-Y) |
| Intel Sunny Cove | 2019 | 14–20 | Multi-core, 2-way multithreading, large out-of-order execution engine, 5-wide superscalar/5-issue |
| Intel Cypress Cove | 2021 | 12–14 | Multi-core, 5-wide superscalar/6-issue, large out-of-order execution engine, big-core design |
| Intel Willow Cove | 2020 | | Multi-core, SMT |
| Intel Golden Cove | 2021 | 12–14 | Multi-core, SMT, 6-wide superscalar, large out-of-order execution engine, big core |
| Intel Redwood Cove | 2023 | | Multi-core, SMT |
| Intel Lion Cove | 2024 | 12 | Multi-core, no SMT, 8-wide decoder, big core |
| Intel Xeon Phi 7120x | 2013 | 7-stage integer, 6-stage vector | Multi-core, multithreading, 4 hardware-based simultaneous threads per core which can't be disabled unlike regular HyperThreading, Time-multiplexed multithreading, 61 cores per chip, 244 threads per chip, 30.5 MB L2 cache, 300 W TDP, Turbo Boost, in-order dual-issue pipelines, coprocessor, Floating-point accelerator, 512-bit wide Vector-FPU | 
| LatticeMico32 | 2006 | 6 | Harvard architecture | 
| Nvidia Denver | 2014 | | Multicore, superscalar, 2-way decode, L2 |
| Nvidia Carmel | 2018 | | Multicore, 10-way superscalar, L3 |
| POWER1 | 1990 | | Superscalar, out-of-order execution |
| POWER3 | 1998 | | Superscalar, out-of-order execution |
| POWER4 | 2001 | | Superscalar, speculative execution, out-of-order execution |
| POWER5 | 2004 | | 2-way simultaneous multithreading, out-of-order execution, integrated memory controller |
| IBM POWER6 | 2007 | | 2-way simultaneous multithreading, in-order execution, up to 5 GHz |
| IBM POWER7+ | | | Multi-core, multithreading, out-of-order, superscalar, 4 intelligent simultaneous threads per core, 12 execution units per core, 8 cores per chip, 80 MB L3 cache, true hardware entropy generator, hardware-assisted cryptographic acceleration, fixed-point unit, decimal fixed-point unit, Turbo Core, decimal floating-point unit |
| IBM POWER8 | 2013 | 15–23 | Superscalar, L4 cache | 
| IBM POWER9 | 2017 | 12–16 | Superscalar, out-of-order execution, L4 cache | 
| IBM Power10 | 2021 | | Superscalar |
| IBM Cell | 2006 | | Multi-core, multithreading, 2-way simultaneous multithreading (PPE), Power Processor Element, Synergistic Processing Elements, Element Interconnect Bus, in-order execution |
| IBM Cyclops64 | | | Multi-core, multithreading, 2 threads per core, in-order |
| IBM zEnterprise zEC12 | 2012 | 15/16/17 | Multi-core, 6 cores per chip, up to 5.5 GHz, superscalar, out-of-order, 48 MB L3 cache, 384 MB shared L4 cache | 
| IBM A2 | | 15 | Multi-core, 4-way simultaneous multithreading |
| PowerPC 401 | 1996 | 3 | |
| PowerPC 405 | 1998 | 5 | |
| PowerPC 440 | 1999 | 7 | |
| PowerPC 470 | 2009 | 9 | Symmetric multiprocessing (SMP) | 
| PowerPC e300 | | 4 | Superscalar, branch prediction |
| PowerPC e500 | | Dual 7 stage | Multi-core |
| PowerPC e600 | | 3-issue 7 stage | Superscalar out-of-order execution, branch prediction |
| PowerPC e5500 | 2010 | 4-issue 7 stage | Out-of-order, multi-core | 
| PowerPC e6500 | 2012 | | Multi-core |
| PowerPC 603 | | 4 | 5 execution units, branch prediction, no SMP |
| PowerPC 603q | 1996 | 5 | In-order | 
| PowerPC 604 | 1994 | 6 | Superscalar, out-of-order execution, 6 execution units, SMP support | 
| PowerPC 620 | 1997 | 5 | Out-of-order execution, SMP support | 
| PWRficient PA6T | 2007 | | Superscalar, out-of-order execution, 6 execution units |
| R4000 | 1991 | 8 | Scalar | 
| StrongARM SA-110 | 1996 | 5 | Scalar, in-order | 
| SuperH SH2 | | 5 | |
| SuperH SH2A | 2006 | 5 | Superscalar, Harvard architecture | 
| SPARC | | | Superscalar |
| hyperSPARC | 1993 | | Superscalar |
| SuperSPARC | 1992 | | Superscalar, in-order |
| SPARC64 VI/VII/VII+ | 2007 | | Superscalar, out-of-order[6] |
| UltraSPARC | 1995 | 9 | |
| UltraSPARC T1 | 2005 | 6 | Open source, multithreading, multi-core, 4 threads per core, scalar, in-order, integrated memory controller, 1 FPU | 
| UltraSPARC T2 | 2007 | 8 | Open source, multithreading, multi-core, 8 threads per core | 
| SPARC T3 | 2010 | 8 | Multithreading, multi-core, 8 threads per core, SMP, 16 cores per chip, 2 MB L3 cache, in-order, hardware random number generator | 
| Oracle SPARC T4 | 2011 | 16 | Multithreading, multi-core, 8 fine-grained threads per core of which 2 can be executed simultaneously, 2-way simultaneous multithreading, SMP, 8 cores per chip, out-of-order, 4 MB L3 cache, hardware random number generator |
| Oracle SPARC T5 | 2013 | 16 | Multithreading, multi-core, 8 fine-grained threads per core of which 2 can be executed simultaneously, 2-way simultaneous multithreading, 16 cores per chip, out-of-order, 16-way associative shared 8 MB L3 cache, hardware-assisted cryptographic acceleration, stream-processing unit, RAS features, 16 cryptography units per chip, hardware random number generator |
| Oracle SPARC M5 | | 16 | Multithreading, multi-core, 8 fine-grained threads per core of which 2 can be executed simultaneously, 2-way simultaneous multithreading, 6 cores per chip, out-of-order, 48 MB L3 cache, RAS features, stream-processing unit, hardware-assisted cryptographic acceleration, 6 cryptography units per chip, hardware random number generator |
| Fujitsu SPARC64 X | | | Multithreading, multi-core, 2-way simultaneous multithreading, 16 cores per chip, out-of-order, 24 MB L2 cache, RAS features |
| Imagination Technologies MIPS Warrior | | | |
| VIA C7 | 2005 | | In-order execution |
| VIA Nano (Isaiah) | 2008 | | Superscalar out-of-order execution, branch prediction, 7 execution units |
| WinChip | 1997 | 4 | In-order execution | 
Notes
- [a] According to AMD's K5 data sheet. The design incorporates many ideas and functional parts from AMD's Am29000 32-bit RISC microprocessor design.
- [b] According to AMD's K6 data sheet. The design is based on NexGen's Nx686 and is therefore not a direct successor to the K5.
References
- [1] "Products We Design". amd.com. Retrieved 19 January 2014.
- [2] "wp-content/uploads/2013/07/AMD-Steamroller-vs-Bulldozer". cdn3.wccftech.com. Archived from the original on 17 October 2013. Retrieved 19 January 2014.
- [3] Kozierok, Charles M. (17 April 2001). "Cyrix 5x86 ("M1sc")". The PC Guide. Archived from the original on 6 February 2019. Retrieved 19 January 2014.
- [4] "Computer Science 246: Computer Architecture" (PDF). Harvard University. Archived from the original (PDF) on 24 December 2013. Retrieved 23 December 2013. P6 pipeline.
- [5] Intel Itanium 2 Processor Hardware Developer's Manual (2002). p. 14. http://www.intel.com/design/itanium2/manuals/25110901.pdf. Retrieved 28 November 2011.
- [6] "Multi Core Processor SPARC64 Series : Fujitsu Global". fujitsu.com. Retrieved 19 January 2014.