Hacker News · Feb 14, 2026 · Collected from RSS
Article URL: https://blog.yossarian.net/2020/11/30/How-many-registers-does-an-x86-64-cpu-have
Comments URL: https://news.ycombinator.com/item?id=47014405
Points: 4, Comments: 1
ENOSUCHBLOG: Programming, philosophy, pedaling.

Nov 30, 2020

Tags: programming, x86

x86 is back in the general programmer discourse, in part thanks to Apple’s M1 and Rosetta 2. As such, I figured I’d do yet another x86-64 post. Just like the last one, I’m going to cover a facet of the x86-64 ISA that sets it apart as unusually complex among modern ISAs: the number and diversity of registers available.

Like instruction counting, register counting on x86-64 is subject to debates over methodology. In particular, for this blog post, I’m going to lay the following ground rules:

- I will count sub-registers (e.g., EAX for RAX) as distinct registers. My justification: they have different instruction encodings, and both Intel and AMD optimize/pessimize particular sub-register use patterns in their microcode.
- I will count registers that are present on x86-64 CPUs, but that can’t be used in long mode.
- I won’t count registers that are only present on older x86 CPUs, like the 80386 and 80486 test registers.
- I won’t count microarchitectural implementation details, like shadow registers.
- I will count registers that aren’t directly addressable, like MSRs that can only be accessed through RDMSR. However, I won’t (or will try not to) double-count registers that have multiple access mechanisms (like RDMSR and RDTSC).
- I won’t count model-specific registers that fall into these categories:
  - MSRs that are only present on niche x86 vendors (Cyrix, Via)
  - MSRs that aren’t widely available on recent-ish x86-64 CPUs
  - MSRs that are completely undocumented (both officially and unofficially)

Errata: I accidentally included AVX-512 in some of the original counts below, not realizing that it hadn’t been released on any AMD CPUs. The post has been updated.

In addition to the rules above, I’m going to use the following considerations and methodology for grouping registers together:

- Many sources, both official and unofficial, use “model-specific register” as an umbrella term for any non-core or non-feature-set register supplied by an x86-64 CPU. Whenever possible, I’ll try to avoid this in favor of more specific categories.
- Both Intel and AMD provide synonyms for registers (e.g. CR8 as the “task priority register,” or TPR). Whenever possible, I’ll try to use the more generic/category-conforming name (like CR8 in the case above).
- In general, the individual cores of a multicore processor have independent register states. Whenever this isn’t the case, I’ll make an effort to document it.

General-purpose registers

The general-purpose registers (or GPRs) are the primary registers in the x86-64 register model. As their name implies, they are the only registers that are general purpose: each has a set of conventional uses[1], but programmers are generally free to ignore those conventions and use them as they please[2].

Because x86-64 evolved from a 32-bit ISA which in turn evolved from a 16-bit ISA, each GPR has a set of subregisters that hold the lower 8, 16 and 32 bits of the full 64-bit register. As a table:

| 64-bit | 32-bit | 16-bit | 8-bit (low) |
|--------|--------|--------|-------------|
| RAX | EAX | AX | AL |
| RBX | EBX | BX | BL |
| RCX | ECX | CX | CL |
| RDX | EDX | DX | DL |
| RSI | ESI | SI | SIL |
| RDI | EDI | DI | DIL |
| RBP | EBP | BP | BPL |
| RSP | ESP | SP | SPL |
| R8 | R8D | R8W | R8B |
| R9 | R9D | R9W | R9B |
| R10 | R10D | R10W | R10B |
| R11 | R11D | R11W | R11B |
| R12 | R12D | R12W | R12B |
| R13 | R13D | R13W | R13B |
| R14 | R14D | R14W | R14B |
| R15 | R15D | R15W | R15B |

Some of the 16-bit subregisters are also special: the original 8086 allowed the high byte of AX, BX, CX, and DX to be accessed independently, so x86-64 preserves this for some encodings:

| 16-bit | 8-bit (high) |
|--------|--------------|
| AX | AH |
| BX | BH |
| CX | CH |
| DX | DH |

So that’s 16 full-width GPRs, fanning out to another 52 subregisters.

Registers in this group: 68. Running total: 68.
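The GPR count above can be double-checked mechanically. The following is a minimal Python sketch that rebuilds the names from the two tables and tallies them; the helper name `gpr_names` and the constants are made up for illustration and are not part of any real tool.

```python
# Legacy GPRs: 64-bit name -> (32-bit, 16-bit, low 8-bit) sub-registers.
LEGACY_GPRS = {
    "RAX": ("EAX", "AX", "AL"),
    "RBX": ("EBX", "BX", "BL"),
    "RCX": ("ECX", "CX", "CL"),
    "RDX": ("EDX", "DX", "DL"),
    "RSI": ("ESI", "SI", "SIL"),
    "RDI": ("EDI", "DI", "DIL"),
    "RBP": ("EBP", "BP", "BPL"),
    "RSP": ("ESP", "SP", "SPL"),
}

# High-byte sub-registers inherited from the 8086.
HIGH_BYTES = ["AH", "BH", "CH", "DH"]


def gpr_names():
    """Return (full_width, sub_registers) as two lists of register names."""
    full = list(LEGACY_GPRS)
    subs = [name for trio in LEGACY_GPRS.values() for name in trio]
    # R8 through R15 follow a regular D/W/B suffix pattern.
    for n in range(8, 16):
        full.append(f"R{n}")
        subs.extend([f"R{n}D", f"R{n}W", f"R{n}B"])
    subs.extend(HIGH_BYTES)
    return full, subs


full, subs = gpr_names()
print(len(full), len(subs), len(full) + len(subs))  # 16 52 68
```

The 52 sub-registers break down as 16 × 3 low-order views plus the 4 high-byte registers.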
Special registers

This is sort of an artificial category: like every ISA, x86-64 has a few “special” registers that keep things moving along. In particular:

- The instruction pointer, or RIP. x86-64 has 32- and 16-bit variants of RIP (EIP and IP), but I’m not going to count them as separate registers: they have identical encodings and can’t be used in the same CPU mode[3].
- The status register, or RFLAGS. Just like RIP, RFLAGS has 32- and 16-bit counterparts (EFLAGS and FLAGS). Unlike RIP, these counterparts can be partially mixed: PUSHF and PUSHFQ are both valid in long mode, and LAHF/SAHF can operate on the bits of FLAGS on some x86-64 CPUs outside of compatibility mode[4]. So I’m going to go ahead and count them.

Registers in this group: 4. Running total: 72.

Segment registers

x86-64 has a total of 6 segment registers: CS, SS, DS, ES, FS, and GS. The operation varies with the CPU’s mode:

- In all modes except for long mode, each segment register holds a selector, which indexes into either the GDT or LDT. That yields a segment descriptor which, among other things, supplies the base address and extent of the segment.
- In long mode, all but FS and GS are treated as having a base address of zero and a 64-bit extent, effectively producing a flat address space. FS and GS are retained as special cases, but no longer use the segment descriptor tables: instead, they access base addresses that are stored in the FSBASE and GSBASE model-specific registers[5]. More on those later.

Registers in this group: 6. Running total: 78.

SIMD and FP registers

The x86 family has gone through several generations of SIMD and floating-point instruction groups, each of which has introduced, extended, or re-contextualized various registers:

- x87
- MMX
- SSE (SSE2, SSE3, SSE4, …)
- AVX (AVX2, AVX-512)

Let’s do them in rough order.

x87

Originally a discrete coprocessor with its own instruction set and register file, the x87 instructions have been regularly baked into x86 cores themselves since the 80486.
Because of its coprocessor history, x87 defines both normal registers[6] (akin to GPRs) and a variety of special registers needed to control the FPU state:

- ST0 through ST7: 8 80-bit floating-point registers
- FPSW, FPCW, FPTW[7]: Status, control, and tag-word registers
- “Data operand pointer”: I don’t know what this one does, but the Intel SDM specifies it[8]
- Instruction pointer: the x87 state machine apparently holds its own pointer to the last x87 instruction executed
- Last instruction opcode: this is apparently distinct from the x87 opcode, and has its own register

Registers in this group: 14. Running total: 92.

MMX

MMX was Intel’s first attempt at consumer SIMD in their x86 chips, released back in 1997. For design reasons that are a complete mystery to me, the MMX registers are actually sub-registers of the x87 STn registers: each 64-bit MMn occupies the mantissa component of its corresponding STn. Consequently, x86 (and x86-64) CPUs cannot execute MMX and x87 instructions at the same time.

Edit: This section incorrectly included MXCSR, which was actually introduced with SSE. Thanks to /u/Skorezore for pointing out the error.

Registers in this group: 8. Running total: 100.
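To make the MMX/x87 aliasing described above a little more concrete, here is a toy Python model of a single 80-bit register with a 64-bit MMX view over its mantissa. The class `X87Register` and its methods are hypothetical helpers, not a real API. The detail that an MMX write fills the sign and exponent bits with ones comes from the Intel SDM, not from this post, so treat it as an assumption of the sketch.

```python
MANTISSA_BITS = 64
MANTISSA_MASK = (1 << MANTISSA_BITS) - 1


class X87Register:
    """Toy model of one 80-bit x87 register (hypothetical helper, not a real API)."""

    def __init__(self):
        # 80-bit value laid out as [sign + exponent: 16 bits][mantissa: 64 bits].
        self.bits = 0

    def read_mm(self):
        # The MMX view is just the low 64 bits (the mantissa).
        return self.bits & MANTISSA_MASK

    def write_mm(self, value):
        # Assumption taken from the Intel SDM, not from the post: an MMX
        # write sets the sign and exponent bits (79:64) to all ones.
        self.bits = (0xFFFF << MANTISSA_BITS) | (value & MANTISSA_MASK)


reg = X87Register()
reg.write_mm(0x0123456789ABCDEF)
print(hex(reg.read_mm()))               # 0x123456789abcdef
print(hex(reg.bits >> MANTISSA_BITS))   # 0xffff
```

Because both views share the same storage, any MMX write clobbers whatever floating-point value the x87 register held, which is why the two instruction groups can’t be freely interleaved.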
SSE and AVX

For simplicity’s sake, I’m going to wrap SSE and AVX into a single section: they use the same sub-register pattern as the GPRs and x87/MMX do, so they fit well into a single table:

| AVX-512 (512-bit) | AVX2 (256-bit) | SSE (128-bit) |
|-------------------|----------------|---------------|
| ZMM0 | YMM0 | XMM0 |
| ZMM1 | YMM1 | XMM1 |
| ZMM2 | YMM2 | XMM2 |
| ZMM3 | YMM3 | XMM3 |
| ZMM4 | YMM4 | XMM4 |
| ZMM5 | YMM5 | XMM5 |
| ZMM6 | YMM6 | XMM6 |
| ZMM7 | YMM7 | XMM7 |
| ZMM8 | YMM8 | XMM8 |
| ZMM9 | YMM9 | XMM9 |
| ZMM10 | YMM10 | XMM10 |
| ZMM11 | YMM11 | XMM11 |
| ZMM12 | YMM12 | XMM12 |
| ZMM13 | YMM13 | XMM13 |
| ZMM14 | YMM14 | XMM14 |
| ZMM15 | YMM15 | XMM15 |
| ZMM16 | YMM16 | XMM16 |
| ZMM17 | YMM17 | XMM17 |
| ZMM18 | YMM18 | XMM18 |
| ZMM19 | YMM19 | XMM19 |
| ZMM20 | YMM20 | XMM20 |
| ZMM21 | YMM21 | XMM21 |
| ZMM22 | YMM22 | XMM22 |
| ZMM23 | YMM23 | XMM23 |
| ZMM24 | YMM24 | XMM24 |
| ZMM25 | YMM25 | XMM25 |
| ZMM26 | YMM26 | XMM26 |
| ZMM27 | YMM27 | XMM27 |
| ZMM28 | YMM28 | XMM28 |
| ZMM29 | YMM29 | XMM29 |
| ZMM30 | YMM30 | XMM30 |
| ZMM31 | YMM31 | XMM31 |

In other words: the lower half of each ZMMn is YMMn, and the lower half of each YMMn is XMMn. There’s no direct register access for just the upper half of YMMn, nor does ZMMn have direct 256- or 128-bit access for the chunks of its upper half.

SSE also defines a new status register, MXCSR, that contains flags roughly parallel to the arithmetic flags in RFLAGS (along with floating-point flags in the x87 status word). SSE also introduces a load/store instruction pair for manipulating it (LDMXCSR and STMXCSR).

AVX-512 also introduces eight opmask registers, k0 through k7. k0 is a special case that behaves somewhat like the “zero” register on some RISC ISAs: when it is selected as an instruction’s write mask, it means “no masking,” as if it held a bitmask of all ones.

Errata: The table above includes AVX-512, which isn’t available on any AMD CPUs as of 2020. I’ve updated the counts below to only include SSE and AVX2-introduced registers.

Registers in this group: 33. Running total: 133.

Bounds registers

Intel added these with MPX, which was intended to offer hardware-accelerated bounds checking. Nobody uses it, since it doesn’t work very well.
But x86 is eternal and slow to fix mistakes, so we’ll probably have these registers taking up space for at least a while longer:

- BND0 through BND3: Individual 128-bit registers, each containing a pair of addresses for a bound.
- BNDCFGS: Bound configuration, kernel mode.
- BNDCFGU: Bound configuration, user mode.
- BNDSTATUS: Bound status, after a #BR is raised.

Registers in this group: 7. Running total: 140.

Debug registers

These are what they sound like: registers that aid and accelerate software debuggers, like GDB. There are 6 debug registers of two types:

- DR0 through DR3 contain linear addresses, each of which is associated with a breakpoint condition.
- DR6 and DR7 are the debug status and control registers. DR6’s lower bits indicate which debug conditions were encountered (upon entering the debug exception handler), while DR7 controls which breakpoint addresses are enabled and their breakpoint conditions (e.g., when a particular address is written to).

What about DR4 and DR5? For reasons that are unclear to me, they don’t exist (and never have)[9]. They do have encodings but are treated as DR6 and DR7, respectively, or produce a #UD exception when CR4.DE[bit 3] = 1.
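As a rough illustration of how DR7 ties each of DR0 through DR3 to an enable state and a breakpoint condition, here is a small Python sketch that decodes a raw DR7 value. The field positions (enable bits in bits 0 to 7, R/W and LEN fields starting at bit 16) follow the Intel SDM layout rather than anything stated in this post. The function name `decode_dr7` and the dictionary keys are illustrative only.

```python
# R/W field: which kind of access triggers the breakpoint.
# (The 0b10 "io" encoding is only defined when CR4.DE = 1.)
RW_CONDITIONS = {0b00: "execute", 0b01: "write", 0b10: "io", 0b11: "read/write"}

# LEN field: size in bytes of the watched location.
LEN_BYTES = {0b00: 1, 0b01: 2, 0b10: 8, 0b11: 4}


def decode_dr7(dr7):
    """Decode the per-breakpoint fields of a DR7 value for DR0 through DR3."""
    breakpoints = []
    for n in range(4):
        breakpoints.append({
            "index": n,
            "local": bool((dr7 >> (2 * n)) & 1),       # Ln enable bit
            "global": bool((dr7 >> (2 * n + 1)) & 1),  # Gn enable bit
            "condition": RW_CONDITIONS[(dr7 >> (16 + 4 * n)) & 0b11],
            "length": LEN_BYTES[(dr7 >> (18 + 4 * n)) & 0b11],
        })
    return breakpoints


# Example: DR0 locally enabled as a 4-byte write watchpoint.
example = (0b1101 << 16) | 0b1
for bp in decode_dr7(example):
    if bp["local"] or bp["global"]:
        print(bp)
```

With the example value, only breakpoint 0 is reported, as a locally enabled 4-byte write watchpoint. A debugger would pair this with the linear address stored in DR0.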