AMD recently held a dedicated technical event where it disclosed the Zen CPU architecture for the first time and publicly demonstrated an 8-core, 16-thread Zen chip matching Intel's Core i7-6900K at the same clock frequency. At that event, however, it gave only a rough outline of the architecture. Now, at the Hot Chips 2016 conference, AMD has released many first details of the Zen architecture and explained where the 40% improvement comes from. First, the 40% uplift does not refer to actual performance but to instructions per clock (IPC), a theoretical metric, and the baseline for the comparison is the current Excavator architecture.
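To see why an IPC gain is not the same thing as a performance gain, it may help to recall that instruction throughput is roughly IPC multiplied by clock frequency. The sketch below is only an illustration: the function name and the 10% lower clock are hypothetical, and only the 40% IPC figure comes from AMD.

```python
def relative_performance(ipc_ratio, clock_ratio=1.0):
    """Throughput relative to a baseline, using instructions/s = IPC * clock."""
    return ipc_ratio * clock_ratio


# A 40% IPC gain at an unchanged clock frequency.
same_clock = relative_performance(1.40)

# Hypothetical case: the same IPC gain, but with a 10% lower clock.
lower_clock = relative_performance(1.40, 0.90)

print(f"{same_clock:.2f}x, {lower_clock:.2f}x")  # prints "1.40x, 1.26x"
```

In other words, the 40% figure translates directly into performance only if the clock frequency stays the same as the baseline.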
These are the Zen architecture's specific improvements in performance and power consumption. The core engine improvements include: two logical threads per core, improved branch misprediction handling, better branch prediction, a larger op cache, wider micro-op dispatch, larger integer / floating-point instruction schedulers, wider retire, and larger retire / load / store queues.
The cache system gains a write-back cache, faster L2 / L3 caches, faster loads to the floating-point unit, better L1 / L2 data prefetching, close to double the L1 / L2 cache bandwidth, and an increase of up to four times in total L3 cache bandwidth.
To reduce power consumption, AMD also put a lot of work into the Zen architecture, which uses low-power design techniques throughout, including multi-level clock gating, the write-back cache, the larger op cache, a stack engine, and so on.
Core microarchitecture details: fetch of four x86 instructions, an op cache, four integer units, two load / store units (supporting 72 out-of-order loads), two floating-point units (128-bit FMAC), a 4-way 64KB instruction cache, an 8-way 32KB data cache, an 8-way 512KB L2 cache, and an 8MB shared L3 cache.
Instruction fetch section
Load / store unit and L2 cache
Floating Point Unit
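The cache sizes and associativities listed in the core microarchitecture details determine how many sets each cache has. The sketch below works this out under one loud assumption: a 64-byte cache line, which the article does not state. The function name is illustrative, and the L3 cache is left out because its associativity is not given here.

```python
LINE_BYTES = 64  # assumption: cache line size is not stated in the article


def num_sets(size_kib, ways, line_bytes=LINE_BYTES):
    """Number of sets = (cache size / line size) / associativity."""
    total_lines = size_kib * 1024 // line_bytes
    return total_lines // ways


# (size in KB, associativity) as quoted in the article.
caches = {
    "L1 instruction": (64, 4),
    "L1 data": (32, 8),
    "L2": (512, 8),
}

for name, (size_kib, ways) in caches.items():
    print(f"{name}: {num_sets(size_kib, ways)} sets")
```

With a 64-byte line this gives 256 sets for the instruction cache, 64 for the data cache, and 1024 for the L2 cache; a different line size would scale those numbers accordingly.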
CPU Complex (CCX): this was already explained yesterday. Zen cores are grouped four to a complex, but apart from sharing the L3 cache, the four cores have no other ties and are completely independent of one another.
Simultaneous multithreading (SMT): all structures are fully available in single-threaded mode, the front-end queues use prioritization, and apart from the micro-op queue, the retire queue, and the store queue, most modules are fully shared.
New instructions: ADX (extended multi-precision arithmetic), RDSEED (complements the RDRAND random number generator), SMAP (Supervisor Mode Access Prevention), SHA1 / SHA256 (secure hash algorithms), CLFLUSHOPT, XSAVEC / XSAVES / XRSTORS, CLZERO (clears a cache line), and PTE Coalescing (merges 4K page table entries into a 32K page). The last two are unique to AMD's Zen architecture.
All the standard instruction sets continue to be supported as well: AVX, AVX2, BMI1 / 2, AES, RDRAND, SMEP.
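On a Linux system, one way to check whether a given CPU reports these extensions is to look at the "flags" line of /proc/cpuinfo. The sketch below parses that text; the flag spellings (for example sha_ni for the SHA extensions) are assumed to follow the Linux kernel's naming, the function names are hypothetical, and the sample text is made up for illustration. PTE Coalescing is not an instruction and is not covered.

```python
# Assumed Linux /proc/cpuinfo flag names for the new extensions listed above.
ZEN_NEW_FLAGS = {
    "adx", "rdseed", "smap", "sha_ni",
    "clflushopt", "xsavec", "xsaves", "clzero",
}


def cpu_flags(cpuinfo_text):
    """Return the feature flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def missing_features(cpuinfo_text, wanted=ZEN_NEW_FLAGS):
    """Sorted list of wanted flags that the CPU does not report."""
    return sorted(set(wanted) - cpu_flags(cpuinfo_text))


# Made-up sample; on a real machine, pass open("/proc/cpuinfo").read() instead.
sample = (
    "processor\t: 0\n"
    "flags\t\t: fpu sse2 avx avx2 adx rdseed smap sha_ni clflushopt xsavec xsaves\n"
)
print(missing_features(sample))  # prints ['clzero']
```

An empty list would mean that every extension in the set is reported by the processor.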