WEBVTT

00:00.000 --> 00:29.000
Okay, hello. So, quickly switching back. I'm happy to be here for the opportunity to talk a bit about what we are doing in the Confidential Containers project in terms of attestation, and the challenges that we have in this particular environment.

00:29.000 --> 00:57.000
I'm working at Microsoft in the Azure Core Linux organization, and I'm involved in the CNCF Confidential Containers project as a contributor. Quickly, the context: Confidential Containers is an effort to introduce confidential computing into the container ecosystem.

00:57.000 --> 01:13.000
And confidential computing today is mostly a VM technology, so there is a bit of friction, because containers usually are not launched in VMs.

01:13.000 --> 01:30.000
There are different boundaries, namespaces, that are being used to isolate containers, but if you want to kind of retrofit confidentiality onto containers, then we have to add VMs.

01:30.000 --> 01:44.000
So luckily, this is not something that we have to start from scratch, because there are prior efforts to do exactly this, for example for using containers in multi-tenant situations.

01:44.000 --> 02:03.000
You want to have strong virtualization boundaries in such an environment, and Kata Containers is one of those pre-existing projects and a good fit for the project to integrate with.

02:03.000 --> 02:16.000
Yeah, and quickly, I think this is simplified, but it's important to understand the context of what's actually going on today when you launch a container.

02:16.000 --> 02:30.000
Usually you would do this on a Kubernetes deployment, a Kubernetes cluster: a user writes their spec,

02:30.000 --> 02:48.000
the container spec, and when we talk about a container spec it's often actually multiple containers that are logically grouped into a pod, which is like a collection of colocated containers; think of maybe a local cache or a message queue.

02:48.000 --> 03:06.000
It's not necessarily just one process, and this is what we call a pod. Those specs are deployed to the API server, and the API server dispatches them to the nodes where the actual workload is running. And on the nodes we have

03:06.000 --> 03:21.000
a component that's called the kubelet, and that kubelet itself then starts interfacing with the actual container layer. It used to be Docker under there; today it's a bit more abstracted, the Container Runtime Interface.

03:21.000 --> 03:37.000
There are different container runtimes, the most popular one today probably being containerd, and this one again translates it into OCI runtime calls. And you have a part of the runtime that is really doing the

03:37.000 --> 03:52.000
cgroups, the namespacing, and spawning the processes that we defined in our pod spec. So this is how it looks in a more or less vanilla setup for containers today.

03:53.000 --> 04:12.000
And yeah, like I mentioned, the sandbox that you see below, this is terminology in the container runtime space, where we basically host the pods, this deployment item I was talking about, the colocated processes, and

04:13.000 --> 04:23.000
this is more or less what people work with, and it's a good interface for confidentiality to build upon.

04:23.000 --> 04:31.000
Yeah, you could also, for example, say we will introduce the confidentiality boundary on the node.

04:31.000 --> 04:37.000
There are also options to do this, but this has several problems, because

04:38.000 --> 04:47.000
there might be untrusted code deployed on the node, say a log scraper or other

04:47.000 --> 05:00.000
administrative processes. So the Confidential Containers project settled on the pod as the abstraction, and that introduces some challenges.

05:00.000 --> 05:15.000
So as we see here, the confidential VM boundary is sitting between the container runtime and, if you will, the container runtime backend. In our case it's

05:15.000 --> 05:22.000
the Kata agent that's spawning processes and kind of acting as the watchdog.

05:25.000 --> 05:38.000
It can actually also be the case that it's not necessarily a VM running on the node. This architecture also allows you to

05:38.000 --> 05:48.000
have a confidential VM that is remote. This is one flexibility of the Kata architecture: there's also a remote hypervisor that is not actually local

05:48.000 --> 05:54.000
but can just talk to, for example, cloud APIs. And in this case it works

05:54.000 --> 06:06.000
pretty well, because for confidential VMs you have very strong isolation between the host and the VM, so it's not like you can do a lot of resource sharing anyway.

06:06.000 --> 06:30.000
Yeah, the attestation architecture for Confidential Containers is not very exotic, I would say: you have a workload that is measured, that is included in evidence, and there's an

06:30.000 --> 06:39.000
attestation service and a key broker service, called Trustee, that's part of the project. It's kept very close to Confidential Containers, but it can also be used on its own.

06:40.000 --> 06:47.000
But I think there's nothing really special in this, in terms of

06:47.000 --> 06:52.000
yeah, the attestation ceremonies that Confidential Containers does.

06:52.000 --> 07:00.000
What's more exotic is the fact that we have to deal with a lot of dynamism.

07:00.000 --> 07:07.000
And this is something that we need to measure, like this sandbox that we saw earlier.

07:07.000 --> 07:28.000
It is one part of what you'd call the workload. There are static components, as we see below it: the Kata agent and the base OS. And this you can all measure; you can use measured boot for that. So this is, I think,

07:30.000 --> 07:41.000
less challenging than measuring the dynamic sandbox, so let's focus on that.

07:41.000 --> 07:54.000
Yeah, so generally, maybe as a remark: the Docker/OCI containers themselves already have some

07:54.000 --> 08:05.000
convenient way of measuring them built in: they're content-addressable if you reference them by digest.

08:05.000 --> 08:15.000
You can verify that what you're pulling from an OCI registry is actually what you intend,

08:15.000 --> 08:20.000
what you specified.
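
As a side note, that digest pinning can be sketched generically like this; it is an illustration of OCI content addressing, not code from the project:

```python
import hashlib

def verify_oci_content(content: bytes, reference_digest: str) -> bool:
    """Compare pulled bytes against the OCI digest they were referenced by.

    OCI digests have the form "<algorithm>:<hex>", e.g. "sha256:abc...".
    """
    algorithm, _, expected_hex = reference_digest.partition(":")
    return hashlib.new(algorithm, content).hexdigest() == expected_hex

# A manifest or layer pinned by digest can be checked after pulling:
blob = b'{"schemaVersion": 2}'
pinned = "sha256:" + hashlib.sha256(blob).hexdigest()
print(verify_oci_content(blob, pinned))         # True
print(verify_oci_content(b"tampered", pinned))  # False
```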

08:20.000 --> 08:24.000
The problem is that sandboxes are controlled imperatively.

08:24.000 --> 08:31.000
So what we're seeing here is a log from the Kata RPCs,

08:31.000 --> 08:39.000
basically. What we get from this earlier sketch, from the Kubernetes

08:40.000 --> 08:45.000
specs, is translated into a list of calls. In this case

08:45.000 --> 08:52.000
there's a create-sandbox call, and there's some exchange.

08:52.000 --> 09:05.000
But then there's a container being created. The first thing that happens is actually two containers: one is a pause container, which is kind of an implementation detail of Kubernetes.

09:05.000 --> 09:30.000
But as you see, those processes are not something very declarative; it's pretty imperative. And it's also not arbitrary how those are executed: the pause container needs to be first, or an init container needs to be launched first, and this is critical.

09:30.000 --> 09:51.000
So I mean, the objective stays the same for Confidential Containers: in the end we want to ensure that only the operations that we intend to run are executed, and before we release a secret, this has to be verified in remote attestation.

09:51.000 --> 09:59.000
And this means we have to somehow get to a notion of a container workload and how we can measure it.

09:59.000 --> 10:09.000
Yeah, as I mentioned before, we have to deal with the dynamic nature, and it's not just that

10:10.000 --> 10:19.000
the commands are imperative; there's also inherent dynamism in Kubernetes.

10:19.000 --> 10:38.000
For example, services that are in the cluster are injected as environment variables for service discovery, and an environment variable, if you just pass it through without verifying it, can subvert confidentiality.

10:39.000 --> 10:51.000
Yeah, say someone changes the Redis host variable, and you're not talking to a local Redis anymore but to a remote one; this is pretty critical.

10:51.000 --> 11:19.000
And also, often there's governance built into Kubernetes services. So the user wants to deploy container A version one, but the platform team decides that you cannot do this, and they will always rewrite your container to version two, transparently. So this is something that is also a challenge in those environments.

11:19.000 --> 11:31.000
And there are different options to deal with this. One option that's being discussed is locking down the control plane in a way that you really only

11:31.000 --> 11:40.000
allow trustable, predictable operations. But it's very hard;

11:40.000 --> 11:52.000
from my perspective it's very hard to do this, because the Kubernetes API surface is pretty large and always increasing, and it would require a lot of

11:52.000 --> 12:07.000
effort to keep up with this while providing an experience that is not completely foreign to users of Kubernetes. But there's an effort underway, called split API, that tries to do this.

12:07.000 --> 12:14.000
I don't really know where we are with this, but I think people are actively working on it.

12:14.000 --> 12:21.000
We also have the option of keeping a log, a bit like IMA.

12:21.000 --> 12:32.000
The problem that we have in confidential computing is that not all TEEs at the moment provide registers that we can extend at runtime.

12:32.000 --> 12:46.000
And as I mentioned, some payloads are not really predictable, because the environment controls them, so the verification of such a log is also not trivial. But I wouldn't rule this out; in the end

12:46.000 --> 12:58.000
we may have to get to an architecture that is somewhat similar.

12:58.000 --> 13:04.000
So finally, the option that we ended up with is having a policy in the TEE.

13:04.000 --> 13:09.000
So we kind of describe the invariants of a deployment, the things that we really want to pin down, as mentioned

13:09.000 --> 13:15.000
earlier: we want the image digest, for example the OCI image digest.

13:15.000 --> 13:21.000
We define, or we say, there are acceptable environment variables that can be injected,

13:21.000 --> 13:24.000
like the ones starting with a service prefix.

13:24.000 --> 13:30.000
We check all Kata RPCs, like, for example, exec'ing into a container,

13:30.000 --> 13:39.000
blocking by default, and have users allowlist options in the policy and cherry-pick what's required.
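
The deny-by-default idea can be illustrated with a toy check; the real agent evaluates Rego policies, so the dictionary shapes and field names here are hypothetical:

```python
def evaluate(request: dict, policy: dict) -> bool:
    """Toy deny-by-default check of one agent RPC against an allowlist."""
    rpc = request["rpc"]
    if rpc not in policy["allowed_rpcs"]:
        return False  # e.g. exec stays blocked unless explicitly allowed
    if rpc == "CreateContainer":
        if request["image_digest"] not in policy["allowed_digests"]:
            return False
        # each env var must match one of the allowed prefixes
        prefixes = tuple(policy["allowed_env_prefixes"])
        if not all(env.startswith(prefixes) for env in request.get("env", [])):
            return False
    return True

policy = {
    "allowed_rpcs": {"CreateSandbox", "CreateContainer"},
    "allowed_digests": {"sha256:" + "ab" * 32},
    "allowed_env_prefixes": ["PATH=", "MY_SERVICE_"],
}
print(evaluate({"rpc": "ExecProcess"}, policy))  # False: blocked by default
```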

13:39.000 --> 13:44.000
And this is something that is currently implemented in the Kata agent.

13:44.000 --> 13:54.000
It's implemented using Rego, which is a familiar tool in Kubernetes land.

13:54.000 --> 14:04.000
And Kata also has tooling to generate a policy, say a default policy, from a pod spec.

14:04.000 --> 14:12.000
And you can see here how this plugs into the whole architecture.

14:12.000 --> 14:25.000
So in front of the Kata RPCs we have a policy engine that is part of the measured, static container runtime, if you will, and it evaluates every call.

14:25.000 --> 14:37.000
So we have the call and the payload, and we check whether they are within the allowed parameters.

14:37.000 --> 14:46.000
So you see here an example policy, how it looks. You see here we allow a few

14:46.000 --> 15:03.000
requests; we define commands that are allowed. And this is something that a user would have to do, or has to do, in this case.

15:03.000 --> 15:10.000
But those policies can be rather large.

15:10.000 --> 15:18.000
So it's maybe something that we need to automate or abstract a bit.

15:18.000 --> 15:23.000
Yeah, so this policy also needs to somehow get provisioned to the TEE.

15:23.000 --> 15:31.000
So this is in contrast to the generic runtime, which is static, which we can measure, and for which there is a predictable measurement:

15:31.000 --> 15:35.000
the policy is specific to the workload.

15:35.000 --> 15:45.000
And we need to provide it as measured configuration at launch, and we also need to link it to the TEE.

15:45.000 --> 15:56.000
And one way of doing this, which I think is pretty typical, is hashing this dynamic configuration somewhere

15:56.000 --> 16:04.000
and storing the hash in SNP's HOSTDATA, or MRCONFIGID for TDX,

16:04.000 --> 16:22.000
or extending it into a vTPM, because the verifier can then check that this HOSTDATA, for example, is part of the attestation report.

16:22.000 --> 16:26.000
For the verifier, this is not changeable or anything.

16:26.000 --> 16:34.000
So if you launch a VM and you specify HOSTDATA, you know at verification time that this VM has been launched with this configuration.

16:34.000 --> 16:39.000
So this is how we bind it to the TEE.
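
The binding just described can be sketched as follows; this is a generic illustration assuming SHA-256 (SNP's HOSTDATA field is 32 bytes, supplied at launch and echoed back inside the signed report):

```python
import hashlib

def host_data_for(init_data: bytes) -> bytes:
    """Digest the launch-time configuration; the host places this value in
    SNP HOSTDATA when creating the VM, and the hardware includes it,
    unmodifiable, in the attestation report."""
    return hashlib.sha256(init_data).digest()  # 32 bytes, matching HOSTDATA

def verifier_accepts(report_host_data: bytes, claimed_init_data: bytes) -> bool:
    """The verifier recomputes the digest over the init data it was shown
    and compares it against the value bound into the signed report."""
    return report_host_data == host_data_for(claimed_init_data)

init_data = b'algorithm = "sha256"\n# ... policy, agent config ...'
report_field = host_data_for(init_data)  # what the TEE would attest to
print(verifier_accepts(report_field, init_data))            # True
print(verifier_accepts(report_field, init_data + b"evil"))  # False
```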

16:39.000 --> 16:44.000
So it's not just a policy; it's a bit more. We call this the init-data specification.

16:44.000 --> 16:50.000
For example, it's also used to set KBS certificates and so on.

16:50.000 --> 16:53.000
But it's rather straightforward.

16:53.000 --> 17:02.000
It's essentially just a dictionary of paths and file contents that we provision on the host.

17:02.000 --> 17:04.000
And this is currently being implemented.

17:04.000 --> 17:12.000
It's not available for all TEEs, but the architecture has been settled, and it will probably be in the next release.

17:13.000 --> 17:22.000
You can see here, from the user side, it's embedded into the pod spec as an annotation.

17:22.000 --> 17:33.000
We just have, I think, some mutations here, but in essence,

17:34.000 --> 17:43.000
you do some basic base64 encoding, I think, and then you provide it in the pod spec that is sent to the API server.
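
That encoding step might look like this; the init-data document and the annotation key shown are illustrative, not necessarily the exact names the project uses:

```python
import base64

# A truncated, hypothetical init-data document to ship with the pod.
init_data_doc = b'algorithm = "sha256"\n\n[data]\n"policy.rego" = "..."\n'

# base64 makes it safe to carry as a plain-string annotation value.
annotation_value = base64.b64encode(init_data_doc).decode("ascii")

# In the pod spec it would sit under metadata.annotations, roughly:
#   annotations:
#     io.katacontainers.config.init_data: <annotation_value>   # key name illustrative
print(base64.b64decode(annotation_value) == init_data_doc)  # True: lossless round trip
```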

17:43.000 --> 17:49.000
Yeah, this would be a truncated example of the init data that we use.

17:49.000 --> 17:57.000
So there's a hashing algorithm field, so that you can tell how to find it in HOSTDATA, for example,

17:57.000 --> 18:05.000
that is, which algorithm you would have to use to verify it.

18:05.000 --> 18:14.000
Yeah, now the challenges that we have with this architecture: from my perspective, I think the problem is that the policy is stateless.

18:14.000 --> 18:22.000
It's stateless, and it tries to kind of map that onto an imperative set of calls.

18:22.000 --> 18:28.000
And there's complex orchestration. So if you just watch every RPC call individually,

18:28.000 --> 18:33.000
we cannot really express things like 'the init container has to run first'.

18:33.000 --> 18:38.000
And if the init container runs second, then this is not what we intend.

18:38.000 --> 18:47.000
So now we have to introduce a notion of state into this, and this is, I think, actually being implemented, or there's an open issue for it.
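
As a toy illustration of why state is needed: a per-call allowlist would accept these container creations in any order, so the check has to remember what has already happened in the sandbox (a hypothetical sketch, not the agent's actual mechanism):

```python
class SandboxState:
    """Track creation order so 'pause before init before app' is enforceable."""

    def __init__(self, expected_order):
        self.expected = list(expected_order)
        self.position = 0

    def allow_create_container(self, name: str) -> bool:
        # Only the next container in the expected sequence may be created.
        if self.position < len(self.expected) and name == self.expected[self.position]:
            self.position += 1
            return True
        return False

s = SandboxState(["pause", "init", "app"])
print(s.allow_create_container("app"))    # False: pause and init must come first
print(s.allow_create_container("pause"))  # True
```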

18:47.000 --> 18:51.000
We have a set of practical problems that are kind of solvable.

18:51.000 --> 19:12.000
One is the size of policies: Kubernetes has limits in terms of how much you can fit into annotations.

19:12.000 --> 19:24.000
And yeah, we have to come up with schemes; compression is probably workable.
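
A quick sketch of why compression is workable here: generated policies are highly repetitive, so gzip before base64 shrinks them considerably compared to base64 alone (a generic illustration; the rule text is made up):

```python
import base64
import gzip

# Stand-in for a large generated policy: many near-identical rules.
policy = "\n".join(
    f'allow if input.image == "sha256:{i:064x}"' for i in range(500)
)

plain = base64.b64encode(policy.encode()).decode("ascii")
packed = base64.b64encode(gzip.compress(policy.encode())).decode("ascii")

print(len(packed) < len(plain))  # True: the compressed form is much smaller
```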

19:24.000 --> 19:29.000
But in the end I think the user experience is not really great.

19:29.000 --> 19:41.000
if you have to write this yourself. So we really have to build tooling that makes it easy for users to deploy their pods into a TEE.

19:42.000 --> 19:53.000
And I think the conceptual problem I'm seeing is: if you're really tracking the RPC interface closely in those policies,

19:53.000 --> 20:07.000
it's very easy to introduce exploit vectors unintentionally, because you basically have to model, or keep track of, the behavior of those APIs as well.

20:07.000 --> 20:21.000
And there are people who work on Kata but not on CoCo; we would have to follow their changes very closely to make sure that also in the default policy

20:21.000 --> 20:23.000
you have this managed.

20:23.000 --> 20:30.000
And I think overall there are more challenges, like runtime measurement, which is coming up now.

20:30.000 --> 20:36.000
I think there are compatibility questions with confidential GPUs.

20:36.000 --> 20:46.000
But yeah, I think we will address them when users demand them, and I think those are all solvable.

20:46.000 --> 20:50.000
And to recap quickly:

20:50.000 --> 20:56.000
attestation for containers and sandboxes is, I think, a tricky topic,

20:56.000 --> 21:06.000
and the solution right now, where we offload the verification to a policy, is a workable mitigation for the time being, but maybe not

21:06.000 --> 21:11.000
the solution with which we cover all cases

21:11.000 --> 21:19.000
in the future. And I think that would be the talk from my side. Thanks for listening.

21:20.000 --> 21:25.000
Thank you.

21:25.000 --> 21:32.000
Thank you.

21:32.000 --> 21:34.000
Yes.

21:34.000 --> 21:39.000
So complexity in general doesn't help with security a lot.

21:39.000 --> 21:47.000
So I'm wondering how far we can stretch this in terms of complexity, like, what are the chances that people get this right?

21:47.000 --> 21:49.000
Yeah absolutely.

21:49.000 --> 21:53.000
I mean, that's really also my impression, that this is too complex, I think.

21:53.000 --> 21:56.000
And we have to somehow get a grip on what

21:56.000 --> 22:02.000
we can guarantee and how we model this. But in the end, there is

22:02.000 --> 22:07.000
little we can do about

22:07.000 --> 22:09.000
the container ecosystem.

22:09.000 --> 22:15.000
So, Confidential Containers

22:15.000 --> 22:24.000
yeah, wants to pretty much pick up users where they are, and they want to have essentially just a small annotation in their pod that says 'confidential: true'.

22:24.000 --> 22:30.000
And that's all they should have to do; this is the north star. Then you have to kind of deal

22:30.000 --> 22:35.000
with what's given to you. But we have to somehow, I think, find

22:35.000 --> 22:45.000
yeah, a way to reduce this complexity, I agree. It is very easy to inadvertently or maliciously introduce

22:45.000 --> 22:50.000
problems in this architecture.

22:50.000 --> 22:53.000
Oh yeah, the question.

22:53.000 --> 23:02.000
The question was: this looks all very complex, and security and complexity don't mix well,

23:02.000 --> 23:06.000
And what can we do to limit those?

23:06.000 --> 23:22.000
[partly inaudible exchange, about reducing the project's scope and complexity]

23:22.000 --> 23:28.000
It's, I think, a user-centric project, where I think the applicability

23:28.000 --> 23:33.000
of the solution was supposed to be there from the beginning. We have to start and

23:33.000 --> 23:38.000
see what we can do so that people can actually start using this, and then,

23:38.000 --> 23:44.000
as we go, we figure out problems with this architecture and have to see how we can address them.

23:44.000 --> 24:22.000
I mean... yeah, the TEEs that we mostly work with are... [partly inaudible]

24:22.000 --> 24:25.000
The idea was to abstract the TEEs

24:25.000 --> 24:30.000
to a common minimum, and so there's no notion of runtime measurements yet in

24:30.000 --> 24:32.000
Confidential Containers.

24:32.000 --> 24:39.000
But this, I think, will become a problem very soon, because people are looking at very long-running

24:39.000 --> 24:44.000
machine-learning training jobs rather than nginx containers that people spin up.

24:44.000 --> 24:49.000
So we probably have to think about continuous

24:49.000 --> 24:52.000
Measurement.

24:52.000 --> 24:55.000
And we have to see how

24:55.000 --> 24:59.000
we can fit this in. Azure has this

24:59.000 --> 25:03.000
solved in a way that there's a vTPM

25:03.000 --> 25:08.000
that is running at a higher privilege level, that is isolated,

25:08.000 --> 25:11.000
as a confidential VM, from the host, but also from the guest.

25:11.000 --> 25:12.000
So you have.

25:12.000 --> 25:18.000
kind of this confidential playground, extended. It's a bit more complex,

25:18.000 --> 25:23.000
but it's linked to the TEE

25:23.000 --> 25:25.000
Cryptographically.

25:25.000 --> 25:29.000
And there you can have

25:29.000 --> 25:34.000
the usual runtime tools to do measurements into PCRs.

25:34.000 --> 25:39.000
So maybe this architecture, as in COCONUT-SVSM, which

25:39.000 --> 25:45.000
is also providing a vTPM for SNP; and we have RTMRs in TDX.

25:45.000 --> 25:49.000
So maybe there's a solution there that is reasonable for the future.

25:49.000 --> 25:53.000
Yeah.

25:53.000 --> 25:57.000
You mentioned Trustee, and Trustee...

25:57.000 --> 25:59.000
So you have to encrypt it.

25:59.000 --> 26:03.000
And then you need to get the encryption key first.

26:03.000 --> 26:05.000
But you want to get that encryption key.

26:05.000 --> 26:08.000
When you run it in a container, you first need to get it.

26:08.000 --> 26:09.000
Of course, through the application.

26:10.000 --> 26:12.000
So where do you get it, then?

26:12.000 --> 26:13.000
Yeah.

26:13.000 --> 26:26.000
So the question was: what about the encrypted images?

26:26.000 --> 26:30.000
They were missing there, but encrypted images are essentially...

26:30.000 --> 26:33.000
well, you don't strictly need encrypted images.

26:33.000 --> 26:37.000
And with encrypted images: at the moment, images are pulled in the guest,

26:37.000 --> 26:40.000
in the upstream architecture.

26:40.000 --> 26:45.000
And so what you can do is:

26:45.000 --> 26:50.000
the guest OS starts and performs the attestation

26:50.000 --> 26:53.000
ceremony, and you get...

26:53.000 --> 26:54.000
Yeah.

26:54.000 --> 27:00.000
you're able to get the encryption key before you launch the container.

27:00.000 --> 27:03.000
This is just part of the pre-container setup.

27:03.000 --> 27:06.000
That's something you need to get.

27:06.000 --> 27:07.000
Yeah, but it's...

27:07.000 --> 27:08.000
Yeah.

27:08.000 --> 27:09.000
Thank you.

27:09.000 --> 27:10.000
Thank you.

27:10.000 --> 27:11.000
Thanks.

27:11.000 --> 27:16.000
Thank you.

