Intel is expected to launch new processors on Monday, starting with its Pentium line, that can subdivide tasks via a hardware feature called Virtualization Technology, or VT.
Here’s hoping that it’s not as half-assed as their SMT implementation…
Who knows whether Apple selected Intel over AMD because Intel is the first to introduce virtualization on the desktop. AMD’s hardware virtualization implementation (aka Pacifica) should arrive later in 2006. But who knows … how important is virtualization for an OS vendor like Apple?
Virtualization is a neat mainframe feature. But is there any need to have it done in hardware by desktop CPUs, beyond what VMware or QEMU (or Mac On Linux) already do rather well? While Intel blurbs out funny fairy tales about how it will ease driver development, a quick look at the PowerPC shows that it pulls off all the tricks with virtually zero performance penalty and just two privilege levels: by running user-level code at full speed and generating precise exceptions for privilege violations.
No, dear friends, this kind of virtualization, with an extra ring that gets strange superpowers over memory accesses in dedicated hardware, aims at something else: putting the provider of the “standard” hypervisor (or “nexus”, if you like) in control of everything else that is done with the machine. And don’t think for a minute that VMware or the other non-majors would get to stay and play; if they could, they would not have demanded open standards in that sector recently.
In the near future you will also find out how this “virtualization” interacts nicely with “trusted platform” features like encrypted memory and on-die TPMs.
xdev, I have to doubt that you’ve ever actually used any of the virtualization software. Are you happy with the performance losses? If so, why?
Giving Intel chips better virtualization support is a great thing. It means software like VMware (and others) can run much closer to full speed.
And TPM is excellent technology, as long as it remains in the owner’s control. I would be happy to have a TPM-enabled motherboard as long as it’s running a trusted (by me) operating system.
TPM will combine with virtualization. It’ll be great to run untrusted applications inside a virtual machine where they can be closely watched.
VMware and QEMU both take a performance penalty when guests run in kernel space.
VMX doesn’t actually give an extra ring; it adds root vs. non-root mode: effectively four extra “virtual” rings on top of the existing “real” ones.
The main point of VMX “root mode” is to be able to trap kernel-mode instructions that would behave wrongly (and can’t be trapped) when run in user mode. This eliminates the need for binary scanning/rewriting (VMware) or full emulation (QEmu) when running kernel-mode code. That alone should improve performance.
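To make the structure concrete, here is a minimal toy sketch of the trap-and-emulate loop a root-mode hypervisor runs. Every name in it (vm_enter, the EXIT_* reasons) is invented for illustration; the real VT-x interface works through VMCS fields and the VMLAUNCH/VMRESUME instructions:

```c
#include <stdio.h>

/* Toy model of a VMX-style dispatch loop. All names are invented;
 * real VT-x enters the guest with VMLAUNCH/VMRESUME and reads the
 * exit reason from the VMCS. */

enum exit_reason { EXIT_CPUID, EXIT_IO, EXIT_HLT };

/* Stand-in for running the guest in non-root mode until it executes
 * something that must trap. Here we just replay a canned sequence. */
static enum exit_reason vm_enter(void)
{
    static const enum exit_reason script[] =
        { EXIT_CPUID, EXIT_IO, EXIT_HLT };
    static int pc = 0;
    return script[pc++];
}

int main(void)
{
    for (;;) {
        switch (vm_enter()) {
        case EXIT_CPUID: puts("emulate CPUID for the guest"); break;
        case EXIT_IO:    puts("emulate port I/O");            break;
        case EXIT_HLT:   puts("guest halted");                return 0;
        }
    }
}
```

The point is that the guest, including its ring 0 code, runs at full speed until it hits one of the trapping instructions; only then does control bounce back to the loop above.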
I’m sure VMware will support these extensions, as will Microsoft’s hypervisor, as does Xen 3.0 (in a preliminary form). As with the TPM, the issue isn’t really the extensions themselves; it’s what people decide to push with them. DRM would be an unfortunate use; improved corporate data security would be more positive.
> VMware and QEMU both take a performance penalty when guests run in kernel space.
Yup. This is because the x86 can’t generate the “precise” exceptions needed for an elevated process to emulate a whole virtual machine for a user process that _thinks_ it is elevated. Therefore both of them have to recompile the code that is meant to run elevated.
If you look at Mac On Linux, the goal is the same. But the PowerPC generates exceptions precise enough for the elevated process to make the user process think it is elevated, without recompiling. And, oh wonder, it runs at about native speed. With just a single bit that makes it tick.
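A toy model of that single-bit trick is sketched below; the opcodes and state layout are invented for illustration, and this is not actual Mac On Linux code:

```c
#include <stdio.h>
#include <stdint.h>

/* Toy trap-and-emulate model. The guest runs deprivileged; any
 * privileged instruction raises a precise exception, and the
 * handler below makes the guest *think* it executed in
 * supervisor mode. Opcodes and layout are invented. */

struct guest {
    uint32_t msr;       /* the machine-state register the guest believes in */
    uint32_t gpr[32];
};

enum { OP_MFMSR = 1, OP_MTMSR = 2 };   /* toy opcodes */

/* Called on a precise privilege-violation exception. Because the
 * faulting instruction and operands are exact, we emulate that one
 * instruction and resume, with no recompilation of guest code. */
static void privilege_trap(struct guest *g, int op, int reg)
{
    switch (op) {
    case OP_MFMSR: g->gpr[reg] = g->msr; break;  /* read virtual MSR  */
    case OP_MTMSR: g->msr = g->gpr[reg]; break;  /* write virtual MSR */
    }
}

int main(void)
{
    struct guest g = { .msr = 0x8000 };
    privilege_trap(&g, OP_MFMSR, 3);
    printf("guest r3 = 0x%x (thinks it read the real MSR)\n", g.gpr[3]);
    return 0;
}
```

Because the exception is precise, the handler knows exactly which instruction faulted, emulates just that one, and resumes; no guest code ever needs to be rewritten.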
So the performance deficit of x86 virtualization comes from a basic architectural flaw in Intel’s CPUs. The easy workaround would have been to fix this bug and add a bit that lets elevated code trap ring 3 privilege violations precisely. That would have enabled 90+% of native performance without any big effort.
Now the question arises: why _did_ they put that much work into VT, and who gains what?
The desktop user hardly gains anything. If anything, s/he is worse off with real hardware isolation between “processes”. The first thing s/he needs is network support, and there is only one cable; then come file copying and cut/paste between the worlds. There is no use case where the simple solution wouldn’t work just right. After all, on PPC it does so. Today.
No. All this only makes sense when you look at the DRM schematics for Windows, with the “legacy” part and the “media” part connected by the Nexus. And Longhorn+1 won’t boot at all if it can’t bring its own Nexus. If you still like VMware, you commie hippies get to switch between options as lovely as FreeBSD and Haiku to your hearts’ content.
If you look at the attack vectors on the hardware of a “remote-owned” system (DMAing into RAM, etc.), you will see that VT/LT disables them precisely, one by one. Its complexity serves no other purpose.
It’s nice to be able to take system calls and other traps directly by transitioning into non-root ring 0; it avoids the overhead of reflecting through the hypervisor in the design you propose. There are other things in the spec to avoid other overheads a “trap everything” scheme would incur, and to pass real PCI devices through to guests, among other things. I certainly don’t remember seeing anything that screamed “DRM specific” when I read the spec.
Are there any specific features of the VT spec that bother you? If you only had precise traps, I don’t really see how that would prevent any DRM scheme from being implemented.
LT is a related issue; it can be (ab)used for DRM purposes. It can also be used to improve security and robustness. My hope is that the market will not tolerate the former.
> It’s nice to be able to take system calls and other traps directly by transitioning [in hardware].
Well, on a mainframe certainly.
> Are there any specific features of the VT spec that bother you?
It’s the overall complexity. I come from the “Woz” school of thinking, where any gate whose job can be done in software should be left out. Looking at the PowerPC 603, you could say this was taken even too far: it has to resolve page-table misses in software, which impacts _every_ program, so there it makes sense to sacrifice the space for the extra gates and gain that 5%. But virtualization is _such_ a fringe issue, one where you have to deal with complicated arbitration decisions that cause a magnitude more headache than a single-figure performance hit; a hit that could probably be more than compensated by just spending those (core) gates on the L1 cache instead.
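For reference, this is roughly what “resolve page table misses in software” means; a toy refill handler in the spirit of the 603, with sizes and table layout invented for illustration:

```c
#include <stdio.h>
#include <stdint.h>

/* Toy model of 603-style software TLB refill: the MMU has no
 * hardware page-table walker, so every TLB miss traps and a short
 * handler does the lookup. Sizes and layout are invented. */

#define PAGE_SHIFT 12
#define TLB_SIZE   4

struct tlbe { uint32_t vpn, pfn; int valid; };
static struct tlbe tlb[TLB_SIZE];

static uint32_t page_table[256];   /* vpn -> pfn, filled by the "OS" */

/* The miss handler the 603 runs in software on every TLB miss. */
static void tlb_refill(uint32_t vpn)
{
    struct tlbe *e = &tlb[vpn % TLB_SIZE];
    e->vpn = vpn;
    e->pfn = page_table[vpn];
    e->valid = 1;
}

static uint32_t translate(uint32_t vaddr)
{
    uint32_t vpn = vaddr >> PAGE_SHIFT;
    struct tlbe *e = &tlb[vpn % TLB_SIZE];
    if (!e->valid || e->vpn != vpn)
        tlb_refill(vpn);            /* this trap costs every program */
    return (e->pfn << PAGE_SHIFT) | (vaddr & ((1 << PAGE_SHIFT) - 1));
}

int main(void)
{
    page_table[5] = 42;
    printf("0x%x -> 0x%x\n", 0x5123, translate(0x5123));
    return 0;
}
```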
Even worse: when you look at a modern PC from the “Bunnie” side (of Xbox hacking), you get a few attack vectors against the hardware: snooping on the buses, DMAing into “trusted” RAM, replacing the RAM itself. VT provides precisely the controls needed to close these holes, holes that _NO_ remote attacker could ever exploit. Maybe by tricking a driver into doing a wrong DMA, but that would be a red-alert remote root hole today as well. The disownment of the user will be complete when LT switches on encryption for the channels I mentioned.
> If you only had precise traps, I don’t really see how that would prevent any DRM scheme from being implemented.
Well, I think VirtualPC lets you take screenshots that are disabled in the client OS. The trick is that the client can’t know whether it is virtualized or not; there are no coloured pills to let it know it’s in the Matrix. Man-in-the-middle access to a TPM would be easy as well, so it would happily spit out any attestation the user needs.
Now, if VT were virtualizable _itself_ (that is, if it generated precise interrupts for its own commands), we’d be looking at a slightly saner situation. Then they would have left a gaping loophole in their system: you’d “just” have to have the keys for trusted booting. Given the intermingling with the hardware, I doubt it, however. Do you, or does anyone, know whether VT can be meta-virtualized?
> My hope is that the market will not tolerate the former.
As the saying goes, “power is for the taking”. The market will get “the former” shoved up its ass. _MY_ hope is that enough aware souls (or just those squabbling over the keys) will delay it long enough for “GPL Corporation” to become a player major enough to force that idea to its rest.
I think you won’t like the AMD Pacifica spec, then: it’s even more complicated. It includes the possibility of in-hardware memory virtualisation (i.e. the hardware will eventually be able to walk the guest OS page table *and* a “guest physical to machine physical” table and service guest TLB faults directly. Erk!). Quite funky though 😉
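A toy of what that two-stage walk computes; flat arrays stand in for the real multi-level tables, and all names here are invented:

```c
#include <stdio.h>
#include <stdint.h>

/* Toy model of Pacifica-style nested paging: on one TLB fault the
 * hardware walks the guest OS page table AND a guest-physical ->
 * machine-physical table. Everything here is invented. */

#define PAGE_SHIFT 12

static uint32_t guest_pt[256];   /* guest virtual page -> guest physical page */
static uint32_t nested_pt[256];  /* guest physical page -> machine page       */

static uint32_t nested_translate(uint32_t gva)
{
    uint32_t off = gva & ((1 << PAGE_SHIFT) - 1);
    uint32_t gpn = guest_pt[gva >> PAGE_SHIFT];  /* stage 1: guest's own table  */
    uint32_t mpn = nested_pt[gpn];               /* stage 2: hypervisor's table */
    return (mpn << PAGE_SHIFT) | off;
}

int main(void)
{
    guest_pt[1] = 7;     /* guest maps vpage 1 at what it thinks is frame 7 */
    nested_pt[7] = 99;   /* hypervisor puts guest frame 7 at machine frame 99 */
    printf("gva 0x1abc -> machine 0x%x\n", nested_translate(0x1abc));
    return 0;
}
```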
I still think that, given a TPM that allows attested boot, a simple hypervisor is going to be able to do equally nasty stuff wrt DRM as a complicated one. That said, concerns about x86 manufacturers shipping buggy hardware as a result of all this complexity are probably valid!
I’ve been thinking about how to efficiently implement nested VT/Pacifica virtualisation under Xen (the normal VT support is there already, but it doesn’t do nesting; basic Pacifica support is on the way). We just spent quite some time in my office trying to figure out what VT would allow; the answer on nested operation was “maybe”. The spec is a bit ill-worded here; I’ll have to ask an Intel guy. Pacifica definitely will be able to support nested virtualisation. In each case, the hypervisor would need extra logic to emulate the extra machine state.
Of course, even without hardware support one could take the kQEmu approach and emulate ring 0 code to run VT OSes.
> I think you won’t like the AMD Pacifica spec, then (i.e. the hardware will … service guest TLB faults directly. Erk!)
Hehe. A bit of overkill, methinks. That only makes sense when you plan for a future where everything runs virtualized and benchmarks start to affect buying decisions, which they mostly do in the consumer market. Assuming my pessimistic vision, it’s a genius move: on doomsday plus one they’ll have that little extra punch ready. Plus they please the nerds with nestability. Neat. Dirk Meyer is so damn smart. (Disclaimer: I own AMD shares, because I think he always was.)
> I still think that, given a TPM that allows attested boot, a simple hypervisor is going to be able to do equally nasty stuff wrt DRM as a complicated one.
Yes, you’re right there. You’d have to have an on-die TPM and core-root-of-trust software, though, in both cases (or you could nest the whole show). One could easily take the PowerPC and add encryption to the memory interface with the same result; they’d just have to pipe I/O through the CPU. But purely politically, it would be slightly less nasty, because everyone would reject it over the (minor) performance hits. Intel is getting those out of the way (including, as I heard, chipset support for I/O into encrypted RAM).
It is not clear that servicing that stuff in hardware will improve performance much relative to just using shadow page tables, as VT-enabled hypervisors must. However, one win we see is that if you *are* running nested hypervisors under Pacifica, it will not be necessary to propagate a page-translation fault from the outermost hypervisor to the next, to the next, etc.; no matter how deep the nesting, the hardware can be made to handle it (with a sufficiently well-designed VMM). This would be really cool to play with, if only I had the time to implement it! I do like AMD’s enhancements despite their extra complexity.
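For contrast, a toy of the shadow-page-table bookkeeping a hypervisor does without nested paging; the layout and names are invented for illustration:

```c
#include <stdio.h>
#include <stdint.h>

/* Toy shadow page table: without nested paging, the hypervisor keeps
 * a composed (guest-virtual -> machine) table that the real MMU uses,
 * and must trap every guest page-table write to keep it in sync. */

static uint32_t guest_pt[256];   /* what the guest thinks it installed */
static uint32_t gp_to_mp[256];   /* hypervisor's guest-phys -> machine */
static uint32_t shadow_pt[256];  /* what the MMU actually walks        */

/* Guest page tables are write-protected, so this runs on a trap
 * whenever the guest modifies a mapping. */
static void on_guest_pte_write(uint32_t vpn, uint32_t gpn)
{
    guest_pt[vpn] = gpn;
    shadow_pt[vpn] = gp_to_mp[gpn];   /* fold both levels into one entry */
}

int main(void)
{
    gp_to_mp[7] = 99;
    on_guest_pte_write(1, 7);
    printf("shadow maps vpage 1 -> machine frame %u\n", shadow_pt[1]);
    return 0;
}
```

Every such guest page-table write costs a trap into the hypervisor; that per-update cost, multiplied per nesting level, is exactly what letting the hardware walk both tables removes.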
I’ll try and find out about nested VT; I suggest you switch on e-mail notifications and I’ll post when I have the answer.
Yet another set of new functions nobody is coding for yet, instead of getting off their asses and actually OPTIMIZING THE EXISTING OPCODES.
They should be EMBARRASSED by their piss-poor clock-to-instruction ratio… It’s like they’ve been resting on their laurels since 1998.
Does virtualization allow more than one OS to run at the same time? The way the article is written, I get the impression that only one OS can run at a time, but applications/services will have their own “physical” partition that will lessen/prevent the chance of one task interfering with another.
Xen will utilize hardware virtualization to allow an unmodified version of Windows to run (very quickly) under Linux. Woo! Free as in open-source beer!
From what I gather from the Ars Technica article (http://arstechnica.com/news.ars/post/20051114-5565.html), Intel’s Vanderpool will allow multiple OSes to run at the same time. Check out the Ars article for a better description of what this is all about.
This would be really nice for Xen to take advantage of, as they have been working with Intel (and AMD too) on their virtualization technology. AMD’s own virtualization technology, named Pacifica, should appear some time in Q1/Q2 2006.
Xen can already run Windows XP under Intel’s VT, although the support requires further optimisation and enhancement. Most of this code has come from Intel themselves. AMD have committed to contributing Pacifica support to Xen as soon as possible.