WEBVTT

00:00.000 --> 00:20.000
All right, I think we're good to go. Let's learn about Rust on Linux.

00:20.000 --> 00:25.080
Yeah, welcome to my talk on Rust for Linux. I'll give you an overview of the project, what

00:25.080 --> 00:35.880
it is. First, my name is Anisse. Anisse Astier. I'm the live blogger of the Kernel Recipes conference,

00:35.880 --> 00:41.640
a great kernel conference in Paris, and I invite you to check it out. Small disclaimer: I'm not

00:41.640 --> 00:47.320
a Rust for Linux contributor. I'm just doing this talk to learn more about Rust for Linux. I invite you

00:47.320 --> 00:54.040
instead to look at Miguel Ojeda's talk yesterday; it was in Janson. It was recorded, of course,

00:54.200 --> 01:00.600
a very interesting presentation. So what is Rust for Linux? First of all, it's a meta project,

01:00.600 --> 01:05.880
just like the kernel. You know, the kernel is a group of multiple projects that merge every 10 weeks,

01:06.680 --> 01:12.440
and Rust for Linux is just that: it's a set of multiple different projects that want to use Rust

01:12.440 --> 01:20.200
in the Linux kernel. The goal of the project is to make Rust a second main language for the kernel

01:21.160 --> 01:30.600
in general, not just for drivers, but in general in the kernel. And upstreaming, making sure

01:30.600 --> 01:38.280
everything is upstream, is the core goal of this project. A quick history of Rust for Linux: it started

01:38.280 --> 01:47.480
in 2013 with a demo; it was a small loadable module, rust.ko. You can basically

01:47.480 --> 01:53.400
click the links if you go check out the slides. They are online on the page, so every link

01:53.400 --> 02:01.480
you see is clickable. Then, fast forward to 2019: Alex Gaynor and Geoffrey Thomas gave a talk

02:01.480 --> 02:08.760
at the Linux Security Summit, inviting kernel developers to use Rust for security reasons.

02:08.760 --> 02:15.080
A year later, in 2020, there was a discussion at the Linux Plumbers Conference with many people already

02:15.160 --> 02:23.720
involved. A year later, Miguel Ojeda sent the first pull request with many, many contributors.

02:24.440 --> 02:31.480
He did the project announcement along with the pull request, and the Rust for Linux experiment

02:31.480 --> 02:36.920
was merged into Linux 6.1, so it was three years ago, almost three years ago.

02:37.880 --> 02:44.520
And ever since 6.1, at every release, of course, there were new changes. At first,

02:44.520 --> 02:47.800
at the beginning there was nothing, there was just the infrastructure to be able to build

02:49.080 --> 02:53.080
things with Rust, but there was nothing to build. Now there's a bit more and we'll talk about it.

02:55.400 --> 03:01.960
First of all, why Rust? Usually when you pick a mainstream programming language, you have to

03:01.960 --> 03:06.760
make this trade-off. You have to pick two of those. You want to be able to do dynamic memory

03:06.760 --> 03:12.280
allocations. Is your program going to have everything statically hard-coded or not? Do you want to

03:12.280 --> 03:19.720
have memory safety? That is, prevent buffer overflows, data races, things like that. And do you want to

03:19.720 --> 03:24.840
have garbage collection or not? Or a language that can be runtime-free, that you can run without any

03:25.000 --> 03:31.000
runtime at native speed? Well, with Rust, you don't need to pick, you get all three, and that's

03:31.880 --> 03:36.440
rare for a mainstream language. There's no other one that comes to mind that has exactly the same

03:37.320 --> 03:47.320
properties. Of course, memory safety is just one aspect of the equation. At the last XDC,

03:47.320 --> 03:53.240
Lyude Paul, she's a kernel developer, said that memory safety is the least convincing point.

03:54.280 --> 04:00.360
Linux kernel developers, they've been using C for decades, they know it's unsafe, so it feels like

04:00.360 --> 04:06.280
they're being talked down to when you keep repeating the same arguments about memory safety.

04:06.280 --> 04:13.320
And ergonomics are, she says, a bit more interesting, because Rust can be a kernel

04:13.320 --> 04:20.120
maintainer. You can encode properties of your kernel subsystem, of your drivers, of your APIs,

04:20.680 --> 04:27.080
in Rust, whereas in C you have to keep them in your mind. That's thanks to the many features

04:27.080 --> 04:33.240
in Rust: you have enums, pattern matching, traits, you have ownership and lifetime tracking,

04:34.280 --> 04:41.080
borrow checking. Of course, there are reasons why one would not want to use Rust. For example,

04:41.160 --> 04:47.240
it's a new language, so if you already know C, learning takes time. And if you've been programming

04:47.240 --> 04:53.160
for a long time, you know usually you can pick up a new language pretty quickly. With Rust,

04:54.120 --> 04:58.440
it's a bit different, like you hit walls, you find things you are used to that you can't do.

04:59.880 --> 05:05.320
So it takes a bit of time. There are other reasons, of course, for example, the Linux kernel

05:05.400 --> 05:13.240
supports 22 or 23 architecture families, sorry, just in general, and rustc has about half of those

05:15.240 --> 05:20.680
up to tier two targets. So if you want, there's a bit more, but it's not exactly the same number

05:20.680 --> 05:25.800
of architectures, so some architectures supported by Linux won't be able to build Rust code,

05:26.840 --> 05:33.160
until we have GCC at least. So right now, for the main architectures, you can build the kernel with

05:33.240 --> 05:41.320
Clang and GCC, and GCC support for Rust code is coming via two projects. So there's

05:41.320 --> 05:47.240
gccrs, which is working with upstream to add Rust support directly into GCC,

05:49.080 --> 05:56.200
and there's rustc_codegen_gcc, which is another project whose goal is to change the rustc

05:56.200 --> 06:04.440
backend to use GCC for code generation instead of LLVM. There are, by the way, other rustc backends,

06:04.440 --> 06:12.280
like Cranelift, but yeah, it's out of scope here. What is the strategy of the Rust for Linux

06:12.280 --> 06:19.240
project for working with the Linux kernel? First, the project wants to lead by example. So it

06:19.320 --> 06:27.320
wants to have documentation everywhere, with tests and safety comments. So when you're doing Rust,

06:27.320 --> 06:33.960
some parts might be unsafe, and every unsafe block in the Linux kernel should have

06:33.960 --> 06:41.320
a safety comment explaining why it's used. It uses bindgen for FFI layer generation. So FFI

06:41.320 --> 06:45.400
means foreign function interface. It's when you want to bridge two languages: usually you write

06:45.400 --> 06:51.640
an interface, either by hand or not. Usually not; you prefer to have this automated. So there's

06:51.640 --> 06:58.840
a tool called bindgen that will parse the C headers or C files and generate a Rust FFI. So it's

06:58.840 --> 07:06.680
basically Rust code with unsafe functions. Regarding those unsafe functions, the Rust for Linux

07:06.680 --> 07:12.680
project says we should not use them directly in the kernel. We should build safe abstractions

07:12.680 --> 07:21.240
on top of those bindings. So you don't call the bindings directly. What are those safe abstractions?
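
NOTE
A minimal userspace sketch of the layering described in the cues above, under
stated assumptions: the `bindings` module only stands in for bindgen output, and
`raw_buf_sum` and `buf_sum` are hypothetical names, not kernel APIs. It shows an
unsafe raw-pointer function, a safe wrapper on top of it, and SAFETY comments.
```rust
// Stand-in for what a bindgen-generated module might look like: raw pointers
// and `unsafe` functions. Nothing here is a real kernel binding.
mod bindings {
    /// # Safety
    /// `ptr` must point to `len` readable, initialized `i32` values.
    pub unsafe fn raw_buf_sum(ptr: *const i32, len: usize) -> i32 {
        let mut total = 0i32;
        for i in 0..len {
            // SAFETY: the caller guarantees that `ptr..ptr + len` is valid for reads.
            total = total.wrapping_add(unsafe { *ptr.add(i) });
        }
        total
    }
}
//
// Safe abstraction: the slice type encodes the "pointer plus valid length"
// constraint, so callers cannot pass a mismatched pair.
pub fn buf_sum(data: &[i32]) -> i32 {
    // SAFETY: a slice always points to `data.len()` initialized elements.
    unsafe { bindings::raw_buf_sum(data.as_ptr(), data.len()) }
}
```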

07:21.240 --> 07:27.240
It's exactly what we were talking about. It's a way to encode the constraints,

07:27.240 --> 07:32.920
which you have in the C code, the C layer, and you need to understand those constraints to put them

07:34.120 --> 07:41.000
in Rust types. That's where you really need kernel domain expertise: expertise

07:41.000 --> 07:46.920
related to the specific kernel domain you are abstracting. So usually it's not trivial to do,

07:46.920 --> 07:53.720
but it depends. In fact, there are already quite a few abstractions in Linux for many APIs;

07:53.720 --> 08:00.840
for example, you might know about workqueue, device, miscdevice, which was merged recently,

08:00.840 --> 08:06.120
platform device for embedded developers and any kind of people using platform devices.

08:07.080 --> 08:13.880
There's an alloc module to do allocations. There's the PID namespace, which was done by

08:13.880 --> 08:22.840
Christian Brauner for a driver, for the Android driver. Anyway, there are many others, it's just a selection,

08:22.840 --> 08:27.880
and this list is quite long, there are a lot of things you can do in Rust, but for many use cases,

08:27.880 --> 08:33.160
it will still be too short. We're still at the beginning, and you might find that you want to

08:33.160 --> 08:38.600
use something in the kernel, some API that's not abstracted, and this would need to be written.

08:40.200 --> 08:46.840
So where are we with drivers? First of all, in the Linux kernel, there's this rule,

08:46.840 --> 08:50.840
called the no-duplicate rule. So if you want to merge a driver, there should not be

08:52.120 --> 08:57.560
another driver that already exists for the same hardware. So of course, this rule has been

08:57.560 --> 09:03.640
relaxed in some cases, and in the Rust for Linux project, it's being relaxed for what

09:04.520 --> 09:11.720
are called reference drivers. So these reference drivers serve as examples of how to write

09:11.720 --> 09:20.200
Rust code. And one of those drivers is the null block driver, which was merged upstream recently,

09:20.200 --> 09:25.000
so it's a complete implementation of the C version. For now, there are only two of them

09:25.080 --> 09:30.680
that exist in the Linux kernel. There are other things that were merged. For example, the DRM

09:31.560 --> 09:38.680
panic QR code generation. So it's kind of like a Linux blue screen of death, except not. So you have,

09:40.680 --> 09:44.680
when you panic, you have a lot of, a wall of text with a lot of information, the stack trace,

09:44.680 --> 09:49.480
the state of the registers. And instead of taking a picture or copying that by hand,

09:49.480 --> 09:53.640
you just scan the QR code, and you have all the data, you can copy-paste it and send it in your

09:53.640 --> 10:00.120
bug report. And all this QR code generation is done in Rust. There are two PHY drivers for

10:00.120 --> 10:06.600
networking, so network PHYs, that have already been merged. And of course, that's just

10:06.600 --> 10:11.400
the tip of the iceberg. There are many upcoming drivers, many things that are being worked on.

10:12.280 --> 10:20.600
One, which is well known, would be the Asahi Linux Apple GPU driver. It's not upstream yet,

10:20.680 --> 10:25.800
but it's being shipped; if you use Asahi Linux, it's being shipped to everyone

10:25.800 --> 10:33.880
running Asahi Linux. There's a reimplementation of the Android Binder driver, which is a core driver

10:33.880 --> 10:40.680
that is upstream in Linux. So it's being rewritten with the aim of completely replacing the C

10:40.680 --> 10:49.560
implementation. The same goes for NVMe. There's an upcoming driver, again for GPUs, for NVIDIA GPUs,

10:50.680 --> 10:55.720
and many others. Yeah, the list is quite long, with many work-in-progress projects.

10:57.560 --> 11:05.480
Recently, in the recent updates, we've had a few changes over the last Linux versions. For example,

11:07.080 --> 11:12.120
it used to be that, to build the Rust code in the kernel, you had to use a

11:12.120 --> 11:18.680
specific Rust compiler version. It's no longer the case since two or three releases. Now you use

11:18.840 --> 11:28.040
Rust 1.78, I think. It was picked very consciously because it's packaged in almost every

11:28.040 --> 11:39.240
distribution; maybe not Debian stable, but it's in testing. There are now a few types to wrap the Linux

11:39.240 --> 11:45.320
allocators. So there's KVec, KBox, which, if you know a thing or two about Rust,

11:45.320 --> 11:52.760
are just like Vec and Box, but for the kernel. They have a slightly different API, and it's

11:52.760 --> 11:58.120
done to be able to pick the type of allocator, because there might be multiple allocators in the

11:58.120 --> 12:06.120
kernel, and the allocation flags. Another update, which was done recently, was that the Rust project,

12:06.840 --> 12:14.200
the Rust compiler, now builds the Linux kernel in CI. So every PR is tested with a Linux kernel

12:14.520 --> 12:20.760
build to make sure that there is no breakage. The Rust project takes Rust for Linux seriously: it's

12:20.760 --> 12:25.960
one of its flagship goals for the second half of 2024, and probably will be for the

12:25.960 --> 12:36.600
first half of 2025. What are kernel maintainers thinking about Rust? What do they think?

12:36.600 --> 12:42.680
So globally, I'd say there's a positive outlook. At the last Maintainers Summit, it seemed that

12:43.800 --> 12:48.120
it was globally positive. There are many supportive maintainers. I gave you the example of

12:49.240 --> 12:57.000
Christian Brauner, who even wrote an abstraction to encode the properties of a domain

12:57.000 --> 13:04.200
he knew very well. Linus Torvalds said full steam ahead to Rust developers: so write code,

13:04.200 --> 13:11.400
even if it's still an experiment, and I invite you again to watch Miguel Ojeda's talk;

13:11.400 --> 13:17.880
he did a really great segment yesterday, and he interviewed many different kernel developers

13:17.880 --> 13:25.000
and gave quotes of what they think of Rust for Linux. Is supporting Rust for Linux still optional,

13:25.000 --> 13:32.280
still an experiment? Well, it depends. It depends on the subsystem. If the maintainer

13:32.360 --> 13:38.680
is supportive, it's not optional. It's part of the features, and the Rust for Linux developers never

13:38.680 --> 13:46.280
said that you can break Rust wherever you want. In the worst-case scenario, if there's a disagreement

13:46.280 --> 13:51.640
over something, then you can reach this point, but it's not something that was ever said.

13:53.640 --> 14:00.280
So that concludes the first part of this talk. Now we'll show a few code examples to show

14:00.280 --> 14:05.320
why Rust is interesting in the Linux kernel, and we'll start with this, which is a direct

14:05.320 --> 14:14.360
copy-paste from the presentation by Alice Ryhl and Carlos Llamas on the Binder driver rewrite.

14:14.360 --> 14:20.360
So you have a comparison between C code and Rust code. So you see, in the Rust code, you have just

14:20.360 --> 14:27.640
a closing brace, and that's perfectly normal; that's because of lifetime tracking, of ownership

14:27.720 --> 14:33.240
tracking, and the way the Drop trait works. And of course the C code is only just a part,

14:33.240 --> 14:43.640
it's not even complete. Let's now look at the minimal Rust driver. So this is directly
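
NOTE
A small sketch of why cleanup can collapse into a closing brace, assuming a
hypothetical `Guard` type that is not taken from the Binder slides. Each guard
releases its resource in `Drop`, so the early return and the normal return run
the same cleanup without any explicit error-path code.
```rust
use std::cell::Cell;
// Hypothetical resource guard: counts how many times a resource was released.
pub struct Guard<'a> {
    released: &'a Cell<u32>,
}
impl Drop for Guard<'_> {
    fn drop(&mut self) {
        // Runs automatically when the guard goes out of scope.
        self.released.set(self.released.get() + 1);
    }
}
pub fn work(released: &Cell<u32>, fail: bool) -> Result<u32, &'static str> {
    let _first = Guard { released };
    let _second = Guard { released };
    if fail {
        // Both guards are dropped here, in reverse order of declaration.
        return Err("early exit");
    }
    Ok(42)
    // On the success path, both guards are dropped at the closing brace.
}
```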

14:43.640 --> 14:49.640
from the kernel source tree; if you go into the samples, you see how to write a module or a driver.

14:49.720 --> 15:01.320
For that, you will first use a prelude; it will import quite a few things that the

15:01.320 --> 15:06.680
Rust for Linux developers find important into your scope. What's important, for example: you have

15:06.680 --> 15:16.360
the module macro; it allows you to declare your driver. Your driver has a type, it's a structure,

15:16.360 --> 15:23.640
we'll see that later, and it has some metadata: it has a name, a license, a description,

15:23.640 --> 15:35.640
everything you usually have in C code. And if you look a bit further, I told you the module has a type;

15:35.640 --> 15:43.240
this is basically a structure, and this is the state of the driver itself. Here it's an example,

15:43.240 --> 15:52.600
so the state is an array of numbers, a dynamic array of signed 32-bit integers, which is a

15:52.600 --> 16:00.440
KVec; I talked to you about it a bit earlier. And then, for this structure,

16:00.440 --> 16:06.680
you need to implement what we call a trait in Rust. The trait is called Module,

16:06.680 --> 16:13.000
so you implement the Module trait, and what does that mean? It means that you need to have this function,

16:13.080 --> 16:19.320
the init function; that's what implementing this trait means. And inside this function,

16:19.320 --> 16:26.440
you will see the pr_info macro; it's basically almost the same thing as the pr_info Linux

16:27.320 --> 16:32.520
kernel function. Well, it's not actually a function, it's also a macro, but that's not important. And you

16:32.520 --> 16:42.760
declare your dynamic array, so this will be done on the stack, and then you push things to it:

16:42.760 --> 16:50.920
you call the push function, like for Vec in Rust, except this function can do allocations,

16:50.920 --> 16:55.800
and you need to pass the allocation flags, because in the kernel, depending on the

16:55.800 --> 16:59.800
context, you might need to have different flags, for example, if you are in interrupt context.

17:00.760 --> 17:07.080
So you pass the flags, and of course this allocation is fallible, so it can return an error,

17:07.080 --> 17:16.120
and you propagate this error as the return value of the function itself. And then you return the state,

17:16.120 --> 17:28.040
so you return the structure you declared, with the numbers inside. The end of this driver is

17:29.000 --> 17:35.320
the exit. For the exit, it was chosen to use the Drop trait, which is a built-in trait,

17:35.320 --> 17:41.240
which means that when something goes out of scope, it's called. That's how you do the automatic

17:41.240 --> 17:48.280
cleanup, as in the example that was shown before. And so the structure will implement the trait;

17:48.280 --> 17:55.080
it has one function, and this function calls the print again, and you print the state that you had before.

17:55.160 --> 18:01.160
So that's basically it, that's the Rust minimal sample that's in the Linux kernel

18:01.160 --> 18:13.160
source tree. Let's now show a slightly more complex example, from the Asahi Linux GPU driver.
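
NOTE
A userspace analogue of the walkthrough above, not the kernel sample itself.
Assumptions: `Flags`, `FallibleVec` and this simplified `Module` trait are
hypothetical stand-ins for the kernel's allocation flags, `KVec` and module
trait, and `println!` replaces `pr_info!`. It shows state held in a struct, a
fallible `init`, pushes that take flags and can fail, and exit via `Drop`.
```rust
use std::collections::TryReserveError;
// Hypothetical stand-in for the kernel's allocation flags.
#[derive(Clone, Copy)]
pub enum Flags {
    Kernel,
}
// Hypothetical stand-in for `KVec`: pushing takes flags and can fail.
pub struct FallibleVec<T>(Vec<T>);
impl<T> FallibleVec<T> {
    pub fn new() -> Self {
        FallibleVec(Vec::new())
    }
    pub fn push(&mut self, value: T, _flags: Flags) -> Result<(), TryReserveError> {
        // Reserve first, so an allocation failure becomes an error, not a panic.
        self.0.try_reserve(1)?;
        self.0.push(value);
        Ok(())
    }
    pub fn as_slice(&self) -> &[T] {
        &self.0
    }
}
// Simplified analogue of the module trait: `init` builds the module state.
pub trait Module: Sized {
    fn init() -> Result<Self, TryReserveError>;
}
pub struct RustMinimal {
    numbers: FallibleVec<i32>,
}
impl RustMinimal {
    pub fn numbers(&self) -> &[i32] {
        self.numbers.as_slice()
    }
}
impl Module for RustMinimal {
    fn init() -> Result<Self, TryReserveError> {
        println!("Rust minimal sample (init)");
        let mut numbers = FallibleVec::new();
        // `?` propagates an allocation error to the caller of `init`.
        numbers.push(72, Flags::Kernel)?;
        numbers.push(108, Flags::Kernel)?;
        numbers.push(200, Flags::Kernel)?;
        Ok(RustMinimal { numbers })
    }
}
// Module exit is expressed as `Drop`: it runs when the state goes out of scope.
impl Drop for RustMinimal {
    fn drop(&mut self) {
        println!("My numbers are {:?}", self.numbers.as_slice());
        println!("Rust minimal sample (exit)");
    }
}
```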

18:13.880 --> 18:21.400
And one of the reasons that was stated by Asahi Lina, and even later by Dave Airlie and Danilo Krummrich,

18:21.400 --> 18:27.720
is that Rust is very interesting, because it allows us to abstract the firmware layer,

18:27.720 --> 18:34.360
the interaction between the kernel driver and the firmware. On Apple platforms, and it's

18:34.360 --> 18:41.720
true also for NVIDIA GPUs, the kernel developers don't control the firmware; it's controlled by

18:41.720 --> 18:48.120
the hardware vendor, and with every iteration, they might break something. In order to prevent

18:48.120 --> 18:56.440
breaking the kernel driver, there's an abstraction inside Linux to help, and this will be a macro.

18:57.320 --> 19:03.080
This macro, it's a proc macro; we won't show it, we will show how to use it, but not how it's implemented.

19:03.080 --> 19:09.320
First of all, it's not this one; this one is just a macro to tell the Rust compiler: I want this

19:09.320 --> 19:14.520
structure to be represented like in C, so it will have a C-like representation in memory.

19:15.240 --> 19:21.560
This is the macro, and it has one argument, which is AGX, and we'll come back to it later,

19:22.520 --> 19:28.520
and inside the definition of the macro, there's this AGX version, so it's the same,

19:28.520 --> 19:35.480
it's referring to exactly that. And the firmware interface can be hardware dependent

19:36.200 --> 19:41.320
or version dependent, so the G would be generation and V version, and you have

19:41.720 --> 19:49.240
like three hardware generations that are supported, and six firmware versions. Of course, we don't want to

19:49.240 --> 19:54.520
support all of those, it would be like 18 things, and it would grow even faster, so there are only

19:54.520 --> 20:03.560
five that are supported in Linux, for this out-of-tree driver. Let's go back and take a look

20:03.640 --> 20:12.200
at what this macro allows you to do. You can now add an annotation; this will be, again,

20:12.200 --> 20:18.360
interpreted at build time by the macro, and this annotation allows saying: I want this

20:18.360 --> 20:24.200
next field to be or this next expression in general, to be only for this version, and this version

20:24.200 --> 20:32.920
has an expression evaluator that's run at build time. You see here's a version: it has to be,

20:33.000 --> 20:38.920
if we want to have the counter, it has to be greater than V13, or equal to V13.

20:40.520 --> 20:48.120
So if we go and take a look at what happens with this macro, it will generate each of the five

20:50.120 --> 20:57.160
possibilities we've seen, to generate at build time different structures. You see the structures

20:57.240 --> 21:03.000
are named slightly differently: the name contains a different generation and firmware version,

21:03.640 --> 21:09.640
and we can see that there's one that contains the counter, which is because that's how it's

21:09.640 --> 21:15.240
declared in the firmware interface, and the other does not have it. So yeah, that concludes
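
NOTE
A hand-written sketch of the kind of output such a versioning macro might
generate; it is not the driver's macro or its real structures. The struct names,
the `flags` field and the version suffixes are hypothetical. It shows two
C-layout variants of one firmware structure, where only the newer one has the
`counter` field.
```rust
use std::mem::size_of;
// Hypothetical expansion for an older firmware version: no `counter` field.
#[repr(C)]
#[derive(Default)]
pub struct JobParamsG13V12 {
    pub flags: u32,
}
// Hypothetical expansion for a newer firmware version: `counter` is present.
#[repr(C)]
#[derive(Default)]
pub struct JobParamsG13V13 {
    pub flags: u32,
    pub counter: u32,
}
// With `repr(C)`, the layouts are predictable and differ per version.
pub fn layout_sizes() -> (usize, usize) {
    (size_of::<JobParamsG13V12>(), size_of::<JobParamsG13V13>())
}
```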

21:15.240 --> 21:22.360
this example. I just want to show a small glimpse of what could be possible in the future in

21:22.360 --> 21:31.000
Linux. Usually when you are using Rust, Rust does not protect against deadlocks.

21:31.000 --> 21:38.360
So you can deadlock; it's not one of the properties. It's perfectly safe to deadlock, depending on the

21:38.360 --> 21:44.200
definition of safe; I'm talking about the Rust definition. Of course, using the Rust programming language,

21:44.280 --> 21:53.560
it is possible to build a structure in a way that you can never deadlock. So this was presented

21:53.560 --> 22:00.120
by Joshua Liebow-Feeser at the latest RustConf, in a talk called Safety in an Unsafe World.

22:00.120 --> 22:05.880
I won't go into details, but it explains how to build the context with two traits and a

22:05.880 --> 22:12.760
macro, and how to build basically a mutex directed acyclic graph, so that the order in which

22:12.840 --> 22:19.800
you can take a mutex always depends on the previous one, and you can never take it in the wrong order,

22:19.800 --> 22:27.720
because the order has to be satisfied at compile time. It has no runtime impact;

22:27.720 --> 22:31.640
it's just verified at compile time. If you're a bit curious how it might work, I invite you to

22:31.640 --> 22:41.400
look at it. It works for Fuchsia's netstack, which has 77 mutexes. And that would be it for my presentation,

22:41.800 --> 22:50.920
thanks a lot.
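
NOTE
A minimal sketch of compile-time lock ordering in the spirit of the approach
described above; it is not the Fuchsia implementation. Assumptions: `Unlocked`,
`LevelA`, `LevelB`, `LockAfter`, `Ctx`, `OrderedMutex` and `transfer` are
hypothetical names, and the lock order is written out by hand as trait impls
instead of being generated by a macro. A zero-sized token records the last
level taken, so a wrong order is a type error and nothing is checked at run time.
```rust
use std::marker::PhantomData;
use std::sync::{Mutex, MutexGuard};
// Lock levels; these marker types are hypothetical names for illustration.
pub struct Unlocked;
pub struct LevelA;
pub struct LevelB;
// `L: LockAfter<P>` means: a lock at level `L` may be taken while holding `P`.
// Together, the impls below spell out the edges of the lock-order graph.
pub trait LockAfter<P> {}
impl LockAfter<Unlocked> for LevelA {}
impl LockAfter<Unlocked> for LevelB {}
impl LockAfter<LevelA> for LevelB {}
// Zero-sized token recording, in its type, the last lock level taken.
pub struct Ctx<'a, L>(PhantomData<&'a mut L>);
pub fn unlocked() -> Ctx<'static, Unlocked> {
    Ctx(PhantomData)
}
pub struct OrderedMutex<T, L> {
    inner: Mutex<T>,
    _level: PhantomData<L>,
}
impl<T, L> OrderedMutex<T, L> {
    pub fn new(value: T) -> Self {
        OrderedMutex { inner: Mutex::new(value), _level: PhantomData }
    }
    // Borrowing `ctx` mutably keeps the previous token unusable while the guard lives.
    pub fn lock<'a, P>(&'a self, _ctx: &'a mut Ctx<'_, P>) -> (MutexGuard<'a, T>, Ctx<'a, L>)
    where
        L: LockAfter<P>,
    {
        (self.inner.lock().unwrap(), Ctx(PhantomData))
    }
}
pub fn transfer(a: &OrderedMutex<u32, LevelA>, b: &OrderedMutex<u32, LevelB>) -> u32 {
    let mut root = unlocked();
    let (mut guard_a, mut ctx_a) = a.lock(&mut root);
    let (mut guard_b, _ctx_b) = b.lock(&mut ctx_a);
    // Swapping the two `lock` calls would not compile, because
    // `LevelA: LockAfter<LevelB>` is not implemented.
    *guard_a -= 1;
    *guard_b += 1;
    *guard_a + *guard_b
}
```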

22:55.160 --> 23:03.960
Thank you. The mutex verification: do we need to annotate the mutexes, or does the compiler just

23:03.960 --> 23:10.120
look at all the mutexes and say, well, this one... does it infer that? Or do we have to do it?

23:10.120 --> 23:18.120
It's done manually. So it works for this project. They have like 77 mutexes. They went

23:18.120 --> 23:25.560
through every one of those, and they defined an order in a graph. And I think it could be done for

23:26.920 --> 23:34.920
some subsystems; it's something that needs to be explored. Yeah, I mean, I'm thinking about

23:34.920 --> 23:42.120
the giant comment we have at the top of rmap.c, which lays out the locking hierarchy inside the

23:42.120 --> 23:48.440
MM, and I'm a little bit scared, but it would be nicer to have that actually verified by the

23:48.440 --> 23:54.600
compiler rather than just a comment that can be ignored.

23:54.760 --> 24:08.520
Other questions? Can you explain why there is a no-duplicate rule for drivers? Why is there

24:08.520 --> 24:13.000
a no-duplicate rule for drivers in the kernel? Because you don't want to have duplicate work.

24:13.000 --> 24:17.160
You don't want to have too many people working on different drivers for the same hardware.

24:17.960 --> 24:22.520
So how does that work when you have platforms that are not supported by Rust and you have

24:22.600 --> 24:27.960
code running there? Someone might want to contribute a driver written in Rust, and you

24:27.960 --> 24:31.880
are running on ARM, or whatever, and it's not supported by Rust.

24:31.880 --> 24:37.640
Okay, wrong example. Sorry. Try to guess an unsupported architecture: PowerPC,

24:37.640 --> 24:44.680
maybe? Some old PowerPC? PA-RISC as well. m68k is not supported by Rust, which we have

24:45.560 --> 24:52.360
gems here. It's not supported either. GCC is coming. And of course, if you really want to have

24:52.360 --> 24:58.840
your Rust driver running, it's up to you to invest in gccrs or rustc_codegen_gcc for your architecture.

25:04.440 --> 25:06.280
All right. Thanks all.

