Without diving too deeply into the details, the x86 architecture became a commercial product in the late 1970s with the 16-bit Intel 8086. Back then, the relationship of socket vs. core vs. CPU vs. thread was 1:1. Over time, the definitions have blurred due to advancements in chip technology. With real estate, the motto is “location, location, location.” With transistors, it is “density, density and more density!” Everyone in the industry should be aware of Moore’s Law, which asserts that the number of transistors on a microchip doubles about every two years, while the cost of computers is halved. I’m happy to debate when we will reach the limit of Moore’s Law, but it will forever remain a profound law of technology.
I am old enough to have owned an 8086 computer and used the CP/M operating system with 8” floppies! Over the years, I’ve seen the terms misused as the 1:1 relationship became a 1:many relationship. Most of the confusion centers on the term “CPU,” or central processing unit. The goal of this article is to cover all the terms and hopefully clear up the confusion around sockets, CPUs, cores and threads.
At the most basic level, there is a “motherboard,” which can do nothing without a CPU chip whose pins are inserted into a socket. The more correct term is “CPU socket.” Most common blades run a dual-socket motherboard, so I was in awe when testing an HP DL980 with eight sockets! NUMA could be a whole other article by itself, but it is important to note that a NUMA node is not a synonym for the CPU. A NUMA node is the relationship between the CPU socket and the closest memory bank(s).
The central processing unit is the most misused term in the industry, and I have heard it used to mean sockets, cores and even threads. Ultimately, the relationship between socket and CPU remains 1:1 regardless of how the term is actually used. In some ways, I would like the term to be retired, because the same thing can be expressed as a “socket.” At the end of the day, the most important thing to remember is that the CPU is a piece of integrated silicon and has a 1:1 relationship with the CPU socket on the motherboard.
It wasn’t until 2005 to 2006 that Intel/AMD began releasing CPUs with multiple processing units (processors). I have intentionally not used the word “processor” until now because it represents the ability to execute a single instruction from the operating system. If there were a second most misused IT term, it would be “processor”! Going back to the 8080 CPU, it could handle a single byte of data at a time (hence an 8-bit processor). The processor/core width has since grown to 64 bits (hence 64/8 = 8 bytes of data at a time). With the advent of “cores,” a single CPU with 10 cores can execute 10 simultaneous instructions. If we ignored performance, we could be fine using terms like CPU, core and processor interchangeably. However, each CPU has its own associated cores and memory, so “locality” does come into play. A set of instructions from the operating system that is spread across different CPUs will not share the same CPU cache, and it will run into a host of other locality-specific performance issues.
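The bit-width arithmetic in the paragraph above can be written out as a short Python sketch. The function name bytes_per_word is purely illustrative and is not taken from any vendor tool; it only restates the "64/8 = 8 bytes" calculation.

```python
def bytes_per_word(bit_width: int) -> int:
    """Return how many 8-bit bytes fit in one word of the given bit width."""
    if bit_width <= 0 or bit_width % 8 != 0:
        raise ValueError("bit width must be a positive multiple of 8")
    return bit_width // 8


# Illustrative widths only: 8-bit (one byte at a time) up to 64-bit.
for width in (8, 16, 32, 64):
    print(f"{width}-bit word = {bytes_per_word(width)} byte(s)")
```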
A thread is simply a queue for an operating system instruction. The confusion comes with the term “hyperthreading,” which has been around since the 1980s. Intel did not release an HT (hyperthreading) chip until the Pentium 4 (circa 2002). Hyperthreading has been the bane of my existence due to a huge misunderstanding of the term. And vendors have made this worse with OS-level definitions that lean on words like CPU/core/processor. Simply stated, enabling HT allows two threads (think queues) per core. If I have HT enabled on a 10-core system, I have 10 cores and 20 threads. I can still only execute 10 OS instructions per cycle, but I can queue up 20 OS instructions. This provides a level of efficiency by allowing multiple instruction queue/dequeue events to occur. HT can provide upwards of 30% performance improvement by reducing latency (improving efficiency). However, it is not performing any magic; it just gets you closer to the theoretical maximum number of instructions that can be executed in any given timeframe on a CPU/core.
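To make the 10-core example concrete, here is a minimal Python sketch of the socket-core-thread arithmetic. The HostTopology class and its field names are assumptions made up for illustration; they do not correspond to any VMware or operating system API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostTopology:
    """Hypothetical description of a physical host (illustration only)."""
    sockets: int            # physical CPU packages, one per CPU socket
    cores_per_socket: int   # execution units on each CPU
    threads_per_core: int   # 1 with HT disabled, 2 with HT enabled

    @property
    def cores(self) -> int:
        return self.sockets * self.cores_per_socket

    @property
    def threads(self) -> int:
        return self.cores * self.threads_per_core


ht_off = HostTopology(sockets=1, cores_per_socket=10, threads_per_core=1)
ht_on = HostTopology(sockets=1, cores_per_socket=10, threads_per_core=2)

print(ht_off.cores, ht_off.threads)  # 10 10
print(ht_on.cores, ht_on.threads)    # 10 20
```

Python itself is a handy illustration of the naming problem: despite its name, os.cpu_count() reports logical CPUs, which on an HT-enabled machine means threads rather than cores or sockets.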
Why accurate definitions matter
I mention everything above for one simple reason: I am a VMware Technical Account Manager. TAMs help customers get the most value out of their VMware technology investments and stay up to date with the latest in the tech world. Another level of complexity is added when IT vendors and personnel introduce terms like “logical” and “virtual” to represent the abstraction/emulation of physical entities. I am on calls with vSphere customers all the time and hear statements like “My processor utilization in vCenter is showing 50% of what the OS is showing.” And hence starts my standard response of “Are we talking about host or workload metrics? And are you talking about cores or threads?”
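As a rough illustration of why the cores-or-threads question matters, the Python sketch below divides the same amount of work by two different denominators. This is a deliberately simplified model with made-up numbers; it is an assumption for illustration and not how vCenter or any guest OS actually calculates utilization.

```python
def utilization_pct(busy_units: float, total_units: int) -> float:
    """Percentage of capacity in use, given how many units are busy."""
    if total_units <= 0:
        raise ValueError("total_units must be positive")
    return 100.0 * busy_units / total_units


cores = 10    # physical cores on the host
threads = 20  # HT enabled: two threads per core
busy = 10     # ten instruction streams running at once (made-up figure)

print(utilization_pct(busy, cores))    # 100.0 when counted against cores
print(utilization_pct(busy, threads))  # 50.0 when counted against threads
```

The workload is identical in both lines; only the definition of "processor" in the denominator changed.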
Using ESXTOP as another example, the term “PCPU” is used to represent a thread (and not a physical CPU or even a core!). The term “core” in ESXTOP does mean a core (OK, so now you are going to throw in the correct term!). If you see only cores and no PCPU values, that is a sign that HT is not enabled in the BIOS of the ESXi host. The PCPU view is somewhat helpful because it lets you know the percentage of time each thread is executing on a core.
Even the configuration of a virtual machine can be confusing. When you edit the CPU properties, you can set the number of CPUs, and then the sub-option is cores per socket rather than cores per CPU. Since the upper definition is CPUs, why not make the sub-definition match? When it comes to setting a CPU/core relationship within a VMware workload/virtual machine, be careful about using cores at all. For example, with vMotion, I can move a VM from a 6 core/CPU host to a 12 core/CPU host and vice versa. If you are setting CPUs/cores, then the setting should match the underlying physical infrastructure, and that may be hard if you have hosts with different densities.
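One way to picture the mixed-density problem is a small Python sketch that checks which hosts can hold a given cores-per-socket setting. The helper hosts_that_fit, the host names and the "fits within the physical cores per CPU" rule are all assumptions made for illustration; this is not a VMware API or an official sizing rule.

```python
def hosts_that_fit(vm_cores_per_socket: int,
                   host_cores_per_cpu: dict[str, int]) -> list[str]:
    """Return hosts whose physical cores per CPU can hold one virtual socket."""
    return [name for name, cores in host_cores_per_cpu.items()
            if vm_cores_per_socket <= cores]


# Hypothetical cluster mixing the two densities from the vMotion example.
cluster = {"host-a": 6, "host-b": 12}

print(hosts_that_fit(4, cluster))  # ['host-a', 'host-b']
print(hosts_that_fit(8, cluster))  # ['host-b']
```

In this sketch, a VM configured with 8 cores per socket lines up with only one of the two hosts, which is the kind of mismatch that can appear after a vMotion between hosts of different densities.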
As long as we’re all on the same page
The key to success is remembering that the x86 architecture is based on socket-CPU-core-thread. When you are dealing with a definition, the first thing to do is determine what the definition is referencing. You also need to be on the same page as your colleagues. And don’t feel disheartened that IT has made this difficult. I am reminded of a comedy act that had me in stitches. Although I don’t remember the entire rant, I do remember “We drive on a parkway and park on a driveway!” The nomenclature becomes less important if everyone agrees on what is being referenced.