WEBVTT

00:00.000 --> 00:07.000
Thank you everyone.

00:07.000 --> 00:12.000
Hello. So today we are going to talk about confidential virtual machines. We will

00:12.000 --> 00:13.000
demystify them for you.

00:13.000 --> 00:19.000
We will try to understand the system stack that supports confidential virtual machines.

00:19.000 --> 00:24.000
And for the first part of the presentation, we will learn about what enlightenment means.

00:24.000 --> 00:28.000
And for the second part of the presentation, we are going to speak about how we can achieve

00:28.000 --> 00:29.000
that enlightenment.

00:29.000 --> 00:31.000
So, I am Arjuna.

00:31.000 --> 00:32.000
This is Ankitav.

00:32.000 --> 00:34.000
We are both from the Azure Linux team at Microsoft.

00:34.000 --> 00:36.000
Azure Linux is a first-party distribution.

00:36.000 --> 00:42.000
And lately we have been working on enabling CVMs, confidential virtual machines, on Azure Linux.

00:42.000 --> 00:43.000
So, let's begin.

00:43.000 --> 00:48.000
What to expect: by the end of this talk, you should be able to understand what confidential

00:48.000 --> 00:49.000
virtual machines are.

00:49.000 --> 00:51.000
And what some of the cloud provider trends are.

00:51.000 --> 00:56.000
And so in this talk, we will be focused on confidential virtual machines provided by

00:56.000 --> 00:57.000
cloud providers.

00:57.000 --> 01:02.000
Also, these concepts that we will learn about: UKI, secure boot, measured boot, and so on.

01:02.000 --> 01:06.000
These are just as applicable to bare metal.

01:06.000 --> 01:11.000
But we will be focused on cloud providers and we will learn about certain gaps.

01:11.000 --> 01:16.000
And you will learn about what you need to configure to enable your operating system

01:16.000 --> 01:23.000
image for confidential virtual machines.

01:23.000 --> 01:24.000
Okay.

01:24.000 --> 01:27.000
So, let's start from basics.

01:27.000 --> 01:32.000
Confidential virtual machines are an implementation of confidential computing, which protects your data

01:32.000 --> 01:33.000
in use.

01:33.000 --> 01:38.000
And the TEE, the trusted execution environment, boundary exists at the VM level.

01:38.000 --> 01:43.000
So that means your data in use is protected from the higher-privileged layers, such as your host

01:43.000 --> 01:47.000
OS, hypervisor, or emulated devices.

01:47.000 --> 01:49.000
It offers the most flexible approach.

01:49.000 --> 01:54.000
For example, compared to application-level, enclave-based implementations of TEEs, you can just,

01:54.000 --> 01:58.000
if you have something on your cloud instance or in your VM, you can just easily lift

01:58.000 --> 02:02.000
and shift that application inside a confidential virtual machine as of today.

02:02.000 --> 02:07.000
It provides remote attestation facilities, so you can gain trust that you are running inside

02:07.000 --> 02:09.000
a confidential virtual machine environment.

02:09.000 --> 02:12.000
You can derive that root of trust from the hardware.

02:12.000 --> 02:15.000
It is obviously backed by CPU vendors.

02:15.000 --> 02:19.000
The hardware extensions make it possible to encrypt memory.

02:19.000 --> 02:22.000
AMD SEV-SNP, Intel TDX, and there are many others that support this.

02:22.000 --> 02:26.000
We are going to be focused on AMD SEV-SNP and TDX for this talk.

02:26.000 --> 02:29.000
Cloud providers are no strangers to this technology.

02:29.000 --> 02:36.000
They have been adopting it, and they have been enabling SKUs on their infra to support this.

02:36.000 --> 02:39.000
N2D and C3 are recent announcements.

02:39.000 --> 02:44.000
And Microsoft, for example, has also supported confidential GPUs with confidential virtual

02:45.000 --> 02:47.000
machines.

02:47.000 --> 02:50.000
So why are we talking about this?

02:50.000 --> 02:57.000
Confidential virtual machines allow cloud providers to guarantee this: that you now have your

02:57.000 --> 02:58.000
customer data.

02:58.000 --> 03:01.000
They will have no access to your sensitive workload.

03:01.000 --> 03:04.000
It is protected from your higher privileged layers.

03:04.000 --> 03:09.000
So now, as a user, say you have a sensitive application running on your

03:09.000 --> 03:10.000
machine.

03:10.000 --> 03:11.000
You can reduce costs.

03:11.000 --> 03:13.000
Maybe you can move them onto the cloud.

03:13.000 --> 03:17.000
So it opens up those use cases, and cloud providers are taking them up.

03:17.000 --> 03:24.000
They are enabling these for you.

03:24.000 --> 03:25.000
Okay.

03:25.000 --> 03:28.000
What does memory encryption in a CVM change, then?

03:28.000 --> 03:31.000
How does the system stack differ?

03:31.000 --> 03:36.000
So you do need a special stack for management, communication, and ensuring the security

03:36.000 --> 03:39.000
benefits of a confidential virtual machine.

03:39.000 --> 03:42.000
Let's compare that to a normal VM.

03:42.000 --> 03:44.000
For example, take your host and VMM.

03:44.000 --> 03:50.000
Once your VMM has allocated the memory for the guest, it has no more access to that environment,

03:50.000 --> 03:52.000
to the guest itself.

03:52.000 --> 03:57.000
Control is then with the guest firmware. The guest firmware takes the initial measurements

03:57.000 --> 03:59.000
and it will store them.

03:59.000 --> 04:01.000
And then the guest kernel is CVM-aware.

04:01.000 --> 04:06.000
What that means is that you need to enable some configurations in the guest kernel itself.

04:06.000 --> 04:13.000
And that is required because the guest kernel is responsible, for example, for declaring which

04:13.000 --> 04:18.000
pages are private to it, and then there are also some shared pages between host and guest

04:18.000 --> 04:21.000
that are taken care of by the guest kernel.

04:21.000 --> 04:26.000
VM-to-VMM communication, transitioning from guest to host or host to guest mode, happens through

04:26.000 --> 04:29.000
VM exit events in a normal VM.

04:29.000 --> 04:35.000
Whereas in confidential virtual machines, it happens through VMGEXIT, for example, on AMD, and

04:35.000 --> 04:38.000
through TDCALL on TDX.

04:38.000 --> 04:43.000
And this is a specialized communication protocol that was developed for VM and VMM communication.

04:43.000 --> 04:49.000
For I/O operations and MMIO, direct access is there in a normal VM, and it happens through VM

04:49.000 --> 04:50.000
exit events only.

04:50.000 --> 04:54.000
But in a confidential VM, as we discussed, it happens through explicit VMM calls.

04:54.000 --> 04:58.000
In a confidential VM, I/O will raise exceptions such as #VC and #VE.

04:58.000 --> 05:03.000
And for direct memory access, there is open device access in a normal VM, but in confidential VMs,

05:03.000 --> 05:05.000
it happens through a dedicated bounce buffer.

05:05.000 --> 05:09.000
So that also sometimes adds overhead. Confidential GPUs are

05:09.000 --> 05:20.000
an example where you can have a dedicated device TEE, where the device itself is

05:20.000 --> 05:21.000
attested.

05:21.000 --> 05:26.000
And a special communication protocol exists between the device and the confidential virtual

05:26.000 --> 05:27.000
machine itself.

05:27.000 --> 05:36.000
So, moving on: is memory encryption enough to guarantee confidentiality in cloud VMs?

05:36.000 --> 05:38.000
Let's learn about that.

05:38.000 --> 05:42.000
So VMs and guest OS, where is the problem?

05:42.000 --> 05:46.000
The confidentiality guarantee does not apply to the guest OS.

05:46.000 --> 05:51.000
What that means is that once you're inside, once you have your memory allocated to the guest,

05:51.000 --> 05:55.000
what is running inside the guest itself, let's say your system executables,

05:55.000 --> 05:58.000
they can see the workload that is running within it.

05:58.000 --> 05:59.000
They have access to it.

05:59.000 --> 06:04.000
And secondly, as I said, there are pages which are declared as shared between guest

06:04.000 --> 06:08.000
and host, and that is taken care of by the guest kernel itself.

06:08.000 --> 06:13.000
So you have to be careful about your guest OS itself.

06:13.000 --> 06:17.000
And that is why protecting the guest OS is important.

06:17.000 --> 06:19.000
And how do you do that?

06:19.000 --> 06:20.000
How do you do that?

06:20.000 --> 06:24.000
For example, when they launch the VM, they will load this image from somewhere.

06:24.000 --> 06:25.000
It is lying somewhere.

06:25.000 --> 06:29.000
They will bring up your guest and you need to protect this image.

06:29.000 --> 06:33.000
So you have to ensure your data at rest and data in use both are protected.

06:33.000 --> 06:35.000
Data in use is protected with memory encryption.

06:35.000 --> 06:40.000
And full disk encryption will protect your data at rest as well.

06:40.000 --> 06:43.000
And you have to achieve this without relying on the host.

06:43.000 --> 06:46.000
So you cannot give your decryption keys to the host.

06:46.000 --> 06:50.000
That would allow it to decrypt the OS image.

06:50.000 --> 06:53.000
And you also have to support unattended boot.

06:53.000 --> 06:57.000
You cannot type in your password, because the host can snoop on it.

06:57.000 --> 07:00.000
You would be trusting your cloud provider with that password.

07:00.000 --> 07:02.000
You cannot do that.

07:02.000 --> 07:05.000
As a solution to that, you can use a vTPM.

07:05.000 --> 07:09.000
vTPMs are virtual TPMs; TPMs are hardware chips

07:09.000 --> 07:12.000
that can store your keys.

07:12.000 --> 07:16.000
And you can also use them to perform cryptographic operations,

07:16.000 --> 07:18.000
like decryption, encryption, etc.

07:18.000 --> 07:21.000
But you need to place some trust in your cloud provider.

07:21.000 --> 07:25.000
Because these are usually implementations from the cloud provider itself.

07:25.000 --> 07:28.000
The alternative is remote attestation.

07:28.000 --> 07:31.000
Somewhere in your boot path, you attest the environment itself.

07:31.000 --> 07:37.000
And then you bring in your OS image and bring it up.

07:37.000 --> 07:40.000
So you don't have to trust the cloud, at least.

07:40.000 --> 07:45.000
How would a consumer know what it is and how it is?

07:45.000 --> 07:49.000
So your evidence is derived from the root of trust.

07:49.000 --> 07:54.000
The PSP, which is the AMD Secure Processor, and the quoting enclave, on AMD and TDX respectively.

07:54.000 --> 07:59.000
These are hardware-based modules; you can generate your attestation evidence from there.

07:59.000 --> 08:02.000
They are signed by keys from the hardware vendors.

08:02.000 --> 08:04.000
Then you can fetch your certificates.

08:04.000 --> 08:06.000
So the vendors would give you certificates.

08:06.000 --> 08:12.000
You can use those certificates to verify the authenticity of the report, and in that way you attest the environment itself.

08:12.000 --> 08:14.000
Talk to others.

08:14.000 --> 08:17.000
Usually cloud providers have their own implementation of a verifier, such as MAA, the Google attestation

08:17.000 --> 08:19.000
agent, or Intel Trust Authority.

08:19.000 --> 08:21.000
You can build your own verifier.

08:21.000 --> 08:25.000
If you don't want to place any trust in those either.

08:25.000 --> 08:28.000
Okay.

08:28.000 --> 08:29.000
Okay.

08:29.000 --> 08:30.000
Okay.

08:30.000 --> 08:31.000
I'll skip this diagram.

08:31.000 --> 08:34.000
But it shows how the attestation happens.

08:34.000 --> 08:37.000
Basically, for AMD SEV-SNP as compared to Intel TDX.

08:37.000 --> 08:42.000
I'll hand over to Ankata for the next part.

08:42.000 --> 08:43.000
Here it's enough.

08:43.000 --> 08:45.000
So you should have time, but okay.

08:45.000 --> 08:47.000
So what does the guest kernel,

08:47.000 --> 08:49.000
or the guest OS, need to take care of at this point?

08:49.000 --> 08:51.000
It needs to take care of the kernel.

08:51.000 --> 08:54.000
We need to have some patches applied already for SEV-SNP and TDX.

08:54.000 --> 08:57.000
We need to enable certain kernel configurations.

08:57.000 --> 09:00.000
Then for protecting the data at rest.

09:00.000 --> 09:02.000
We need to have full disk encryption.

09:02.000 --> 09:05.000
Secure boot is needed so that at each step of the boot chain,

09:05.000 --> 09:07.000
we know that only trusted code is executing.

09:07.000 --> 09:12.000
And then since we are encrypting the disk, we also need to give the password only

09:12.000 --> 09:15.000
when the system is in a known good state.

09:15.000 --> 09:18.000
So these are the kernel configurations that you must enable.

09:18.000 --> 09:21.000
The first two are mandatory.

09:21.000 --> 09:25.000
The second set of kernel configurations is basically optional.

09:25.000 --> 09:30.000
And they are mainly needed if you want to attest using the hardware root of trust.

09:30.000 --> 09:34.000
So how do you boot Linux in a secure fashion?

09:34.000 --> 09:36.000
This is a traditional Linux boot chain.

09:36.000 --> 09:38.000
We have the firmware, which loads the boot loader.

09:38.000 --> 09:40.000
Then the kernel, followed by the initramfs.

09:40.000 --> 09:44.000
And then we ask the user for the password, and then we decrypt the OS volume.

09:44.000 --> 09:49.000
However, with secure boot, we have a signature check at each step.

09:49.000 --> 09:54.000
And this applies up to the Linux kernel; you know, the boot chain is secured up to

09:54.000 --> 09:56.000
your Linux kernel.

09:56.000 --> 09:58.000
So this is how the secure boot works.

09:58.000 --> 10:01.000
But is this still enough for a confidential VM?

10:01.000 --> 10:02.000
Actually, no.

10:02.000 --> 10:07.000
Because what if some arbitrary initramfs comes into the picture at this point?

10:07.000 --> 10:10.000
How do you still protect your boot chain?

10:10.000 --> 10:15.000
So this has been simplified with the concept of UKI or unified kernel image.

10:15.000 --> 10:21.000
Where we combine all three of these components, the kernel, the initramfs, and the kernel

10:21.000 --> 10:27.000
command line, into a single UEFI binary, which can be easily integrated with UEFI Secure Boot.

10:27.000 --> 10:30.000
And it is signed by the certificate of the distribution vendor.

10:30.000 --> 10:35.000
So this is how the CVM secure boot flow would look with the help of a UKI.

10:35.000 --> 10:38.000
You don't need to sign those three components separately.

10:38.000 --> 10:40.000
Or you don't need to take care of them separately.

10:40.000 --> 10:43.000
You can use the UKI for that matter.

10:44.000 --> 10:47.000
We have already spoken about these three points.

10:47.000 --> 10:52.000
However, there is a slight flexibility tradeoff that comes with the security of the UKI.

10:52.000 --> 10:54.000
We have a static initramfs.

10:54.000 --> 10:58.000
So you cannot do any runtime module customization with your kernel.

10:58.000 --> 11:00.000
Secondly, the kernel command line is static.

11:00.000 --> 11:04.000
So you need to have a mechanism to discover your root volume.

11:04.000 --> 11:09.000
You cannot just go and put in the GUID of your root volume and have it decrypted.

11:09.000 --> 11:15.000
And then there is no traditional UI like the bootloader UI that we have usually.

11:15.000 --> 11:22.000
So then, as we already spoke about the system, the boot chain is secured using secure boot.

11:22.000 --> 11:28.000
But we still need to know: are we revealing the password for our root volume to a system in a good state?

11:28.000 --> 11:32.000
Is it actually the genuine CVM that we are revealing it to?

11:32.000 --> 11:35.000
So that is where measured boot comes into the picture.

11:35.000 --> 11:38.000
It is a standard way for checking the authenticity of our boot chain.

11:38.000 --> 11:43.000
And this is done using TPMs and something called platform configuration registers, or PCRs.

11:43.000 --> 11:46.000
What happens is at each step of the boot chain.

11:46.000 --> 11:50.000
We basically take the hash of each component of the boot chain.

11:50.000 --> 11:52.000
For example, shim, systemd-boot, etc.

11:52.000 --> 11:56.000
And then all of these hashes are recorded in your TPMs.

11:56.000 --> 12:01.000
When the system boots, these hashes are recorded during the boot process.

12:01.000 --> 12:06.000
And then, finally, the PCR prediction is checked against the actual,

12:06.000 --> 12:09.000
you know, PCR measurements that we already have in the TPM.

12:09.000 --> 12:14.000
Only if both of them match do we, you know, reveal the key to the root volume.

12:14.000 --> 12:18.000
So that we can decrypt the OS root volume from there on.

12:18.000 --> 12:22.000
So we have been talking so much about disk encryption.

12:22.000 --> 12:24.000
Let's see how the disk layout differs.

12:24.000 --> 12:29.000
So in a standard VM, you just have a boot partition and a root partition.

12:29.000 --> 12:32.000
Optionally, you have encryption in the root partition.

12:32.000 --> 12:36.000
In the boot partition, we have all the GRUB files, or the boot loader files.

12:36.000 --> 12:38.000
The kernel, the command line, the initramfs, etc.

12:38.000 --> 12:43.000
However, all of that has been squeezed down into the UKI with our CVM.

12:43.000 --> 12:48.000
Talking about the root partition, we usually use standard encryption and we store the password

12:48.000 --> 12:52.000
either, you know, in a file, or we prompt the user to give it to us.

12:52.000 --> 12:54.000
In this case, we don't need to do that.

12:54.000 --> 12:58.000
We use LUKS for encryption and then there are certain key slots,

12:58.000 --> 13:01.000
which are sealed by the TPM as we already spoke about.

13:01.000 --> 13:03.000
And they can be used for the decryption.

13:03.000 --> 13:07.000
Of course, the root volume, with the user data, is encrypted.

13:07.000 --> 13:12.000
So this is a quick flow of how full disk encryption and decryption works.

13:12.000 --> 13:15.000
We have a LUKS container on the root volume.

13:15.000 --> 13:17.000
Which generates a master key.

13:17.000 --> 13:18.000
We create a TPM policy.

13:18.000 --> 13:20.000
Seal the key with the TPM.

13:20.000 --> 13:24.000
And then, you know, we store that key in the metadata as I already spoke about.

13:24.000 --> 13:25.000
And we encrypt the root volume.

13:25.000 --> 13:27.000
This is how the root decryption works.

13:27.000 --> 13:30.000
And we have already spoken about this one.

13:30.000 --> 13:33.000
So that brings us to the end of the talk.

13:33.000 --> 13:34.000
The time is also up.

13:34.000 --> 13:36.000
Just one minute I'll take for this one.

13:36.000 --> 13:40.000
So these are, you know, the key takeaways.

13:40.000 --> 13:43.000
And we hope you were enlightened by the end of this talk.

13:43.000 --> 13:44.000
Thank you.

13:44.000 --> 13:45.000
Just a quick.

13:45.000 --> 13:47.000
Thank you.

13:47.000 --> 13:52.000
Thank you. Thank you so much.

13:53.000 --> 13:54.000
We do have other references.

13:54.000 --> 13:58.000
And I would just like to mention we have a talk coming up tomorrow by Vitaly at the conference.

13:58.000 --> 14:00.000
In the virtualization devroom.

14:00.000 --> 14:02.000
On a similar topic, with some updates.

14:02.000 --> 14:04.000
So please go and check it out.

14:04.000 --> 14:05.000
Thank you.

