I see dead uops: thoughts on the latest Spectre paper targeting uop caches

comparch.org
Last night, a group of computer security researchers led by Ashish Venkat (University of Virginia) published a paper titled "I See Dead µops: Leaking Secrets via Intel/AMD Micro-Op Caches", which they have submitted to ISCA '21 (the International Symposium on Computer Architecture, a prestigious academic conference). The paper concerns a structure called the micro-op (uop) cache that is commonly used within modern microprocessors. A thorough analysis of how these structures are organized and implemented allows the research team to propose novel timing side-channel attacks similar to those of "Spectre".

Modern microprocessors are decoupled into a "front-end" that decodes the instructions as presented by the programmer (in the form of compiled machine code) and a "back-end" that actually executes instructions (quite likely following a data flow model, out of order with respect to how the program was written, but nonetheless respecting the actual dependencies). Notice I didn't say "these instructions" a second time. What the back-end executes may look similar to the machine code presented at the front, but it might also look very different. Perhaps it has simply been optimized (e.g. by fusing two or more adjacent instructions together into a more efficient one), but more often than not the back-end of the machine will be executing very different instructions known as micro-ops.
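To make the fusion idea concrete, here is a minimal sketch of a toy front-end pass. Everything in it is an assumption made for illustration: the `Instr` type, the `fuse_adjacent` function, the opcode names, and the choice of a compare followed by a conditional jump as the fusable pair. It is not a model of any particular processor's decoder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """A toy instruction: an opcode plus its operands (hypothetical format)."""
    opcode: str
    operands: tuple = ()


def fuse_adjacent(program):
    """Fuse each 'cmp' immediately followed by 'jne' into one 'cmp_jne' op.

    All other instructions pass through unchanged, in program order.
    """
    out = []
    i = 0
    while i < len(program):
        cur = program[i]
        nxt = program[i + 1] if i + 1 < len(program) else None
        if cur.opcode == "cmp" and nxt is not None and nxt.opcode == "jne":
            # Two adjacent instructions become a single fused operation.
            out.append(Instr("cmp_jne", cur.operands + nxt.operands))
            i += 2
        else:
            out.append(cur)
            i += 1
    return out


program = [
    Instr("add", ("r1", "r2")),
    Instr("cmp", ("r1", "r3")),
    Instr("jne", ("loop",)),
]
fused = fuse_adjacent(program)
print([op.opcode for op in fused])  # ['add', 'cmp_jne']
```

The point of the sketch is only that what leaves the front-end need not be the same sequence that entered it: three instructions go in, two operations come out.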

A micro-op (uop) is a very simple instruction. For example, it might be an ADD instruction. It may take a couple of inputs and produce one output. In some cases (such as a simple add), the "macro" ops in the machine code emitted by the programmer's tooling map 1:1 onto the uops used by the machine. But often, a single macro op is decomposed into many individual uops. In RISC-like machines (Arm, RISC-V, etc.) such cracking into multiple uops is usually limited to relatively complex instructions (such as certain loads that also perform complex address…
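The difference between a 1:1 mapping and cracking into several uops can be sketched with a toy decoder. The `crack` function, the `add` and `add_mem` macro ops, the `tmp` register, and the tuple layout of each uop are all hypothetical names chosen for this example; real uop encodings are internal to each processor design and look nothing like this.

```python
def crack(macro_op):
    """Decompose one toy macro op into a list of toy uops.

    Each uop is a tuple: (opcode, destination, source operands...).
    """
    opcode, *operands = macro_op
    if opcode == "add":
        # Simple register add: maps 1:1 onto a single uop.
        dst, src = operands
        return [("add", dst, dst, src)]
    if opcode == "add_mem":
        # Add to a memory location: cracked into load, add, store.
        addr, src = operands
        return [
            ("load", "tmp", addr),
            ("add", "tmp", "tmp", src),
            ("store", addr, "tmp"),
        ]
    raise ValueError(f"unknown macro op: {opcode}")


simple = crack(("add", "r1", "r2"))
cracked = crack(("add_mem", "[r3]", "r2"))
print(len(simple), len(cracked))  # 1 3
```

Each resulting uop takes a couple of inputs and produces one output, which is what makes them easy for the back-end to schedule independently.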