WEBVTT

00:00.000 --> 00:10.000
The next speaker is going to be talking to us about Macros gone wild.

00:10.000 --> 00:12.000
Good morning.

00:12.000 --> 00:18.000
The original alternative title for this talk was "You are not expected to understand this",

00:18.000 --> 00:24.000
after a famous comment that appeared in the sixth edition of Research Unix.

00:24.000 --> 00:28.000
Who of you is familiar with what the C preprocessor does?

00:28.000 --> 00:30.000
Okay.

00:30.000 --> 00:34.000
Most of you, so I will skip that. It does source file inclusion, macro replacement,

00:34.000 --> 00:38.000
conditional compilation and other things, and we will just move on to the problems.

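The three preprocessor facilities just listed can be shown in a few lines. This is an illustrative sketch with made-up names (BUF_SIZE, MAX, LOG, demo), not kernel code:

```c
#include <stdio.h>   /* source file inclusion */

#define BUF_SIZE 64                         /* object-like macro replacement */
#define MAX(a, b) ((a) > (b) ? (a) : (b))   /* function-like macro */

#ifdef DEBUG                                /* conditional compilation */
#define LOG(msg) puts(msg)
#else
#define LOG(msg) ((void)0)
#endif

static int demo(void) {
    char buf[BUF_SIZE];
    (void)buf;                /* silence unused-variable warning */
    LOG("entering demo");     /* compiled out unless DEBUG is defined */
    return MAX(3, 7);
}
```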
00:38.000 --> 00:42.000
Bjarne Stroustrup described the preprocessor as a constant problem to

00:42.000 --> 00:46.000
programmers, maintainers, people porting code, and tool builders.

00:46.000 --> 00:50.000
And this is indeed the case, because it is a separate thing from the compiler.

00:50.000 --> 00:54.000
And therefore it's difficult to analyze: when you see a syntax error from the compiler,

00:54.000 --> 00:58.000
you don't know how the preprocessor changed the code to make that error appear.

00:58.000 --> 01:00.000
It's difficult to reason about.

01:00.000 --> 01:06.000
It confuses the C grammar and semantics: you can have parts that are distinct from the C grammar.

01:06.000 --> 01:12.000
Conditional compilation complicates testing, and its use is associated with many traps and pitfalls.

01:12.000 --> 01:16.000
Books on the C programming language warn about its use.

01:16.000 --> 01:18.000
This is how to use the pre-processor.

01:18.000 --> 01:22.000
Use it properly, otherwise you will get burnt and crash.

01:22.000 --> 01:26.000
So what I will look at is how the pre-processor is used in the Linux kernel.

01:26.000 --> 01:30.000
And specifically examine four things.

01:30.000 --> 01:32.000
First of all the usage characteristics.

01:32.000 --> 01:34.000
The extent it is used in the Linux kernel.

01:34.000 --> 01:38.000
Second, and this is where I will mostly focus,

01:38.000 --> 01:40.000
I will discuss the technical debt it introduces.

01:40.000 --> 01:44.000
We'll see how it has changed over time.

01:44.000 --> 01:48.000
And finally we'll discuss the feasibility of reducing this technical debt.

01:48.000 --> 01:52.000
There are things that make it difficult to switch away from the preprocessor, even though it makes our life difficult.

01:52.000 --> 02:00.000
Especially the feasibility of using Rust as an alternative method for doing many of the things that

02:00.000 --> 02:02.000
macros do nowadays.

02:02.000 --> 02:06.000
For this study I used a tool called CScout.

02:06.000 --> 02:08.000
This is a refactoring browser for C code.

02:08.000 --> 02:12.000
So it ingests C code and allows you to study it,

02:12.000 --> 02:16.000
taking into account the full semantics and tokens of the preprocessor.

02:16.000 --> 02:22.000
So when you see the code and you click on a token it goes back to the token's definition in a

02:22.000 --> 02:24.000
Macro for example.

02:24.000 --> 02:30.000
even if it has been defined in a way that uses the preprocessor.

02:30.000 --> 02:34.000
So it performs both the preprocessing and the semantic analysis of the code.

02:34.000 --> 02:38.000
And I have extended this to collect a number of metrics.

02:38.000 --> 02:40.000
What happens before the pre-processor and after the pre-processor.

02:40.000 --> 02:44.000
Both at the end of each function and at the end of each file.

02:44.000 --> 02:48.000
And I also added functionality to measure keywords and some other metrics

02:48.000 --> 02:52.000
that have to do with the complexity of the code.

02:52.000 --> 02:58.000
I've analyzed three kernels, which you see here,

02:58.000 --> 03:00.000
spaced about 10 years apart.

03:00.000 --> 03:08.000
The most recent one is 6.10, released in July last year.

03:08.000 --> 03:14.000
And as you see there, the number of files has risen from

03:14.000 --> 03:18.000
4,000 files to 23,000

03:18.000 --> 03:20.000
C files nowadays.

03:20.000 --> 03:28.000
And similarly, the number of lines has risen from 5 million to 24 million lines, of which I analyzed

03:28.000 --> 03:32.000
20 million lines, because I analyzed only a single architecture

03:32.000 --> 03:36.000
and one configuration, the most complete configuration, allyesconfig.

03:36.000 --> 03:40.000
As you can see on the bottom, this required considerable resources.

03:40.000 --> 03:46.000
More than a week of processing time for the latest kernel and large amount of memory,

03:46.000 --> 03:50.000
113 gigabytes.

03:50.000 --> 03:54.000
The analysis was not easy for two of the versions.

03:54.000 --> 03:58.000
The early version could not be analyzed with a modern GCC,

03:58.000 --> 04:02.000
and installing an old GCC on a modern Linux was impractical.

04:02.000 --> 04:06.000
And also the 32-bit RAM capacity was insufficient to run CScout.

04:06.000 --> 04:10.000
For this I used QEMU and a hypervisor accelerator.

04:10.000 --> 04:14.000
I had to force the use of deprecated crypto in order to be able to access

04:14.000 --> 04:18.000
a connection to the hypervisor in the QEMU guest.

04:18.000 --> 04:22.000
And I used archived packages, because otherwise I

04:22.000 --> 04:24.000
could not install what was required.

04:24.000 --> 04:30.000
And therefore I compiled it under QEMU and then analyzed it on a powerful host.

04:30.000 --> 04:34.000
For the recent 6.10 version, it required more than a week of processing,

04:34.000 --> 04:38.000
which, by running it again and again, would take months.

04:38.000 --> 04:42.000
And also a lot of RAM, and for this I utilized a supercomputer.

04:42.000 --> 04:50.000
I split the task into 32 tasks running in parallel on 32 supercomputer nodes.

04:50.000 --> 04:54.000
And then I developed a procedure to merge the results on a powerful

04:54.000 --> 04:58.000
node. I tried a number of different ways to do it,

04:58.000 --> 05:00.000
through SQL recursive queries.

05:00.000 --> 05:02.000
Couldn't do it.

05:02.000 --> 05:04.000
I could not merge the graphs.

05:04.000 --> 05:10.000
In the end I developed a command in CScout to merge the parts, as you see here.

05:10.000 --> 05:14.000
I performed a binary tournament merge: 32 processes running in parallel.

05:14.000 --> 05:18.000
These were reduced an hour and a half later, and again after two hours,

05:18.000 --> 05:24.000
and after three hours the merged results were almost ready.

05:24.000 --> 05:32.000
So now we'll describe the findings for the most recent version 6.10.1.

05:32.000 --> 05:36.000
And later I will give some examples of what happened before that time.

05:36.000 --> 05:40.000
So how is the preprocessor used? Extensively.

05:40.000 --> 05:42.000
33% of the defined functions are macro-defined.

05:42.000 --> 05:44.000
It looks like a function.

05:44.000 --> 05:46.000
It's a macro.

05:46.000 --> 05:50.000
72% of what is defined as an identifier is a macro identifier.

05:50.000 --> 05:56.000
And when you look at the usage, 44% of the time when you see a function call,

05:56.000 --> 05:58.000
it's actually a macro behind it.

05:58.000 --> 06:04.000
And if you see an identifier, again 44% of the time it is a macro identifier.

06:04.000 --> 06:08.000
Also interesting 94% of the macro identifiers are never used.

06:08.000 --> 06:14.000
This is not that bad because most of these definitions have to do with the hardware constants

06:14.000 --> 06:18.000
and I think it's good to define them for the sake of completeness,

06:18.000 --> 06:24.000
rather than leaving gaps and having people wonder whether something has been forgotten or not.

06:24.000 --> 06:30.000
The distribution of preprocessor directives varies among the various areas of the kernel.

06:30.000 --> 06:34.000
So we see that in the main part, under the kernel directory,

06:34.000 --> 06:40.000
we see a large number of conditionals probably because the kernel has to serve many purposes,

06:40.000 --> 06:42.000
many architectures and the configurations.

06:42.000 --> 06:46.000
We see under drivers that conditionals are used very little,

06:46.000 --> 06:50.000
probably because drivers target a very specific configuration.

06:50.000 --> 06:56.000
And also in the architecture part, arch, we have a large number of file inclusions.

06:56.000 --> 06:58.000
I'm not sure why.

06:58.000 --> 07:02.000
If we look at what the preprocessor

07:02.000 --> 07:06.000
does with the expansion, we see that the number of tokens used doubles

07:06.000 --> 07:10.000
from 2,000 to 4,000 per file.

07:10.000 --> 07:14.000
And there are some explosions, up to 3 million tokens post-expansion.

07:14.000 --> 07:18.000
Similarly, the number of statements or declarations

07:18.000 --> 07:22.000
that the compiler sees rises from 170 to 300,

07:22.000 --> 07:28.000
And the number of operators from 300 to 760.

07:28.000 --> 07:36.000
Also, the if statements increase from 23 to 36 per compilation unit.

07:36.000 --> 07:40.000
And also there's a huge increase in the number of goto labels,

07:40.000 --> 07:42.000
but this happens for a very specific purpose.

07:42.000 --> 07:44.000
So I know why.

07:44.000 --> 07:48.000
u32 and u16 also appear as parts of labels.

07:48.000 --> 07:52.000
So this conflates those things.

07:52.000 --> 07:54.000
Why is this a problem?

07:54.000 --> 07:56.000
Let me give you some reasons.

07:56.000 --> 08:00.000
Namespace pollution: at the beginning of each function

08:00.000 --> 08:02.000
We have 106 global namespace occupants.

08:02.000 --> 08:04.000
identifiers that are visible there.

08:04.000 --> 08:06.000
That's the median value.

08:06.000 --> 08:10.000
Also, each macro is used in 81 files, so they are used a lot.

08:10.000 --> 08:14.000
And the 10 most frequently defined macro names

08:14.000 --> 08:16.000
are defined 30,000 times.

08:16.000 --> 08:20.000
So the same macro name is defined again and again in various functions.

08:20.000 --> 08:22.000
30,000 times.

08:22.000 --> 08:28.000
And these are used 152,000 times in 2000 files.

08:28.000 --> 08:30.000
Another thing that happens is namespace confusion.

08:30.000 --> 08:34.000
Look at this example, but there are many of these.

08:34.000 --> 08:36.000
For example, this defines BCH_ALLOC_FIELDS

08:36.000 --> 08:38.000
as a macro,

08:38.000 --> 08:42.000
invoking x() with read_time and some value.

08:42.000 --> 08:44.000
Later on this is redefined.

08:44.000 --> 08:48.000
x is defined to create a name out of what comes before it,

08:48.000 --> 08:50.000
as part of the name,

08:50.000 --> 08:52.000
and so creates new names.

08:52.000 --> 08:54.000
And these become part of an enumeration,

08:54.000 --> 08:57.000
As you see here by invoking this macro.

08:57.000 --> 09:00.000
And later on again, X is defined in a different way.

09:00.000 --> 09:03.000
BCH_ALLOC_FIELDS is called again,

09:03.000 --> 09:05.000
invoked again as a macro.

09:05.000 --> 09:08.000
And this accesses members of a structure.

09:08.000 --> 09:12.000
So at the same time, the same identifier must be part of a structure

09:12.000 --> 09:19.000
member, and also part of a name that is

09:19.000 --> 09:23.000
generated dynamically through token pasting.

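The pattern described here is the so-called x-macro idiom. This is a hedged reconstruction with illustrative names (FIELDS, read_time, write_time), not the actual kernel code:

```c
/* One list macro is expanded several times with different
   definitions of x(). */
#define FIELDS(x)    \
    x(read_time, 16) \
    x(write_time, 16)

/* First expansion: token pasting creates enumerator names. */
#define x(name, bits) FIELD_##name,
enum field_id { FIELDS(x) FIELD_COUNT };
#undef x

/* Second expansion: the very same names become structure members. */
#define x(name, bits) unsigned name : bits;
struct fields { FIELDS(x) };
#undef x
```

So read_time lives simultaneously in the enumerator namespace (as FIELD_read_time) and in the structure-member namespace, exactly the kind of confusion the talk points out.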
09:23.000 --> 09:26.000
I've looked at all areas of confusion.

09:26.000 --> 09:29.000
And you see here that macros also become,

09:29.000 --> 09:31.000
at times, members of enumerations,

09:31.000 --> 09:35.000
Parts of labels, parts of structure or union members,

09:35.000 --> 09:39.000
or structure union tags, or ordinary identifiers.

09:39.000 --> 09:43.000
So these confuse things that the C programming language

09:43.000 --> 09:47.000
considers, in theory, separate namespaces.

09:47.000 --> 09:49.000
But macros confuse the two.

09:49.000 --> 09:52.000
So if you change a structure or union tag,

09:52.000 --> 09:56.000
you may also need to change the corresponding go to label.

09:56.000 --> 10:00.000
And another thing is that in the coding style,

10:00.000 --> 10:02.000
these things are actually forbidden, but they happen anyway.

10:02.000 --> 10:04.000
So macros should not affect control flow.

10:04.000 --> 10:07.000
It's not good to return from a macro.

10:07.000 --> 10:10.000
And they should also not access local variables

10:10.000 --> 10:13.000
if they are defined outside of a function.

10:13.000 --> 10:16.000
And yet here's an example of scoping confusion.

10:16.000 --> 10:20.000
This macro here uses bch, and is defined outside the function.

10:20.000 --> 10:23.000
200 lines later, we have a definition of BCH.

10:23.000 --> 10:29.000
21 lines later, this macro is invoked, using this bch value.

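The scoping confusion just described can be sketched in a few lines; the names here (LOG_ID, id, handler) are invented for the example, not taken from the kernel:

```c
/* A macro defined at file scope silently depends on a local variable
   that must exist at every expansion site. */
#define LOG_ID() (id * 10)    /* 'id' is not a parameter of the macro */

static int handler(void) {
    int id = 4;               /* the macro only works because this exists */
    return LOG_ID();
}
```

Reading the macro definition alone, nothing says where id comes from; you must find each call site to know.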
10:30.000 --> 10:32.000
Control flow confusion.

10:32.000 --> 10:39.000
And this happens 3,000 to 7,700 times.

10:39.000 --> 10:44.000
Control flow confusion, you see here again this macro that returns something.

10:44.000 --> 10:48.000
And later on, 77 lines later, it is called,

10:48.000 --> 10:51.000
and this return happens automatically.

10:51.000 --> 10:52.000
This doesn't happen a lot.

10:52.000 --> 10:54.000
I found 12 instances of continue.

10:54.000 --> 10:58.000
40 of break, and 80 of goto, which are more troubling,

10:58.000 --> 11:03.000
and 97 return statements in such macros defined outside functions.

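A minimal sketch of the control-flow confusion being counted here, with invented names (CHECK_OR_FAIL, parse_len); the pattern, not the actual kernel macro:

```c
/* A macro that hides an early return inside what reads like a call. */
#define CHECK_OR_FAIL(cond)                 \
    do {                                    \
        if (!(cond))                        \
            return -1;  /* hidden return */ \
    } while (0)

static int parse_len(int len) {
    CHECK_OR_FAIL(len >= 0);  /* may silently leave the function */
    return len * 2;
}
```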
11:03.000 --> 11:06.000
When I gave this talk at ETH, kernel developers said:

11:06.000 --> 11:09.000
Yes, yes, yes, yes, but these are the people working on drivers.

11:09.000 --> 11:12.000
we don't do it in the kernel directory.

11:12.000 --> 11:14.000
I actually checked after that.

11:14.000 --> 11:18.000
And you see here what happens in the kernel, these violations.

11:18.000 --> 11:22.000
And the variable scope violation actually happens more under the kernel directory.

11:22.000 --> 11:27.000
Another thing is hybrid call paths.

11:27.000 --> 11:31.000
So these are the cases where we don't just have C functions calling C functions,

11:31.000 --> 11:34.000
which is the most common case, or C functions calling macros,

11:34.000 --> 11:38.000
But more complicated stuff, we have instances of macros calling other macros.

11:38.000 --> 11:43.000
And there are almost half a million length-3 chains of a C function calling

11:43.000 --> 11:45.000
another C function via a macro.

11:45.000 --> 11:48.000
If you try to look at this in the debugger, you will not find it.

11:48.000 --> 11:51.000
If you create the call graph using the object file definitions,

11:51.000 --> 11:54.000
you will not find these calls through the macro.

11:54.000 --> 11:56.000
You will find them appear as direct calls, inexplicably so,

11:56.000 --> 12:00.000
because the call does not appear thus in the source code.

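A hybrid call path of the kind just described can be sketched like this; all the names (work_impl, do_work, caller) are illustrative:

```c
/* A C function calls another C function through a macro: the source
   shows do_work(), while the object code and the debugger show only
   work_impl(), so the middle link is invisible to binary-level tools. */
static int work_impl(int n) { return n + 1; }

#define do_work(n) work_impl(n)

static int caller(int n) {
    return do_work(n);  /* source says do_work, binary says work_impl */
}
```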
12:00.000 --> 12:05.000
Another thing, which is actually what made me study this, is expansion explosion.

12:05.000 --> 12:10.000
So about 500 files expand to more than 1,000 per cent.

12:10.000 --> 12:12.000
The median is 87%.

12:12.000 --> 12:17.000
And I found 30 outliers that take more than 14 seconds to compile,

12:18.000 --> 12:21.000
whereas most files compile in less than 2 seconds.

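One mechanism behind such explosions can be sketched with nested function-like macros; this is a simplified illustration (the kernel's actual min/max helpers are more elaborate), with MAX, MAX3, MAX4 as invented names:

```c
/* MAX expands each argument textually twice, so every nesting level
   roughly doubles the expanded text: arguments that are themselves
   macro calls get duplicated again and again. */
#define MAX(a, b)        ((a) > (b) ? (a) : (b))
#define MAX3(a, b, c)    MAX(MAX(a, b), c)     /* MAX(a, b) appears twice */
#define MAX4(a, b, c, d) MAX(MAX3(a, b, c), d) /* and is duplicated again */
```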
12:21.000 --> 12:24.000
Let me show you an example here.

12:24.000 --> 12:29.000
This is the file setup.c from x86 Xen.

12:29.000 --> 12:33.000
It's about 1,000 lines, 26 kilobytes.

12:33.000 --> 12:38.000
When expanded, it becomes 50 megabytes, 88,000 lines.

12:38.000 --> 12:43.000
It takes about 7 minutes to compile and 3 gigabytes of RAM.

12:43.000 --> 12:46.000
And I will try to zoom here.

12:46.000 --> 12:50.000
So what I'd like to show here is the file.

12:50.000 --> 12:53.000
On the right we see where I am, it's very small dot.

12:53.000 --> 12:59.000
And now I'm zooming, moving a bit up and zooming again.

12:59.000 --> 13:04.000
And zooming again, you see the red square and larger.

13:04.000 --> 13:09.000
Okay, we see some code here.

13:10.000 --> 13:26.000
And actually, a few weeks after I found it,

13:26.000 --> 13:29.000
it was actually fixed with this commit.

13:29.000 --> 13:31.000
It caused excessive expansion.

13:31.000 --> 13:34.000
It was a call to the min3 macro.

13:34.000 --> 13:38.000
There are also complexity metrics that computer scientists study

13:38.000 --> 13:40.000
to see how difficult it is to understand the code.

13:40.000 --> 13:42.000
Things called cyclomatic complexity.

13:42.000 --> 13:48.000
having to do with jumps around the code and the graph of the instructions,

13:48.000 --> 13:50.000
Jumping from one place in the other.

13:50.000 --> 13:52.000
And this increases from 4 to 7.

13:52.000 --> 13:54.000
This is called a cyclomatic complexity.

13:54.000 --> 13:57.000
And the Halstead volume, having to do with the identifiers

13:57.000 --> 13:59.000
Visible at a given point.

13:59.000 --> 14:02.000
Also, the median volume increases from 85 to 180,

14:02.000 --> 14:07.000
Which means that it's more difficult to reason about the code and test it.

14:07.000 --> 14:08.000
Other things.

14:08.000 --> 14:13.000
There are composite identifiers pieced together through macros, about 150,000.

14:13.000 --> 14:15.000
Extensive include hierarchies.

14:15.000 --> 14:22.000
So there are some outlier compilation units that include 1.5 million lines.

14:22.000 --> 14:24.000
Each compilation unit.

14:24.000 --> 14:29.000
So each thing we compile imports about 2,000 files.

14:29.000 --> 14:33.000
And also, 36 include file outliers have a nesting depth of 12.

14:33.000 --> 14:36.000
So something includes something else, which includes something else, 12 levels deep.

14:36.000 --> 14:40.000
And I also found several cyclic include-file dependencies.

14:40.000 --> 14:45.000
So in total 170,000, or 7 per compilation unit.

14:45.000 --> 14:49.000
The longest one consists of 10 elements.

14:49.000 --> 14:51.000
So it's this cycle here.

14:51.000 --> 14:53.000
Of course, this does not lead to infinite

14:53.000 --> 14:57.000
recursion, because we protect include files from re-including themselves.

14:57.000 --> 15:01.000
But nevertheless, when we find such things, it's difficult to break them.

15:01.000 --> 15:04.000
It's difficult to reason what's happening here.

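The protection mentioned is the standard include-guard idiom; shown here inline instead of in a separate header, with FOO_H and struct foo as illustrative names:

```c
/* A second inclusion of this header sees FOO_H already defined and
   skips the body, so cyclic inclusion cannot recurse forever. */
#ifndef FOO_H
#define FOO_H

struct foo { int x; };

#endif /* FOO_H */

static int foo_x(void) {
    struct foo f = { 42 };
    return f.x;
}
```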
15:04.000 --> 15:08.000
How has this evolved over time?

15:08.000 --> 15:11.000
So here are the three kernels I looked at.

15:11.000 --> 15:12.000
Two things are good.

15:12.000 --> 15:14.000
Conditional directives are falling.

15:14.000 --> 15:17.000
And include directives are also falling a bit.

15:17.000 --> 15:19.000
This matters especially for the conditional directives.

15:19.000 --> 15:21.000
They make it difficult to test things.

15:21.000 --> 15:24.000
It's good that they are being reduced.

15:24.000 --> 15:28.000
But other things, you see, are still increasing: namespace confusion,

15:28.000 --> 15:30.000
the use of the concatenation operator.

15:30.000 --> 15:32.000
So how can we reduce this technical debt?

15:32.000 --> 15:34.000
First of all, one practical thing.

15:34.000 --> 15:37.000
I found that about 5 million object-like macros,

15:37.000 --> 15:41.000
almost all of them, can simply be rewritten,

15:41.000 --> 15:42.000
Either as a static const value.

15:42.000 --> 15:45.000
And I've verified that GCC compiles it

15:45.000 --> 15:48.000
such that it doesn't take any memory at all.

15:58.000 --> 16:00.000
It doesn't take any memory at all.

16:00.000 --> 16:03.000
It compiles the code as if it was defined as a macro.

16:03.000 --> 16:05.000
If it's used as a macro.

16:05.000 --> 16:07.000
so without taking its address, for example,

16:07.000 --> 16:09.000
or assigning to it.

16:09.000 --> 16:12.000
Or they can also be defined as an enumeration

16:12.000 --> 16:16.000
member, which makes it possible to use them as a compile-time constant.

16:16.000 --> 16:19.000
For instance, for declaring the size,

16:19.000 --> 16:22.000
define the size of an array with it.

16:22.000 --> 16:26.000
And this is possible for 77% of the macros.

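The two conversions just mentioned can be sketched side by side; BUF_SHIFT, buf_shift, BUF_SIZE, and buffer are made-up names for the example:

```c
#define BUF_SHIFT 6                 /* the original macro form */

static const int buf_shift = 6;     /* conversion 1: static const value */

enum { BUF_SIZE = 1 << BUF_SHIFT }; /* conversion 2: enumeration member,
                                       still a compile-time constant */

static char buffer[BUF_SIZE];       /* so it can still size an array */
```

The static const form gains type checking and debugger visibility; the enum form keeps the value usable where C requires a constant expression.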
16:26.000 --> 16:29.000
For the rest, the value is probably not a compile-time constant,

16:29.000 --> 16:31.000
about 1 million of them.

16:31.000 --> 16:35.000
Or the value is used in token concatenation,

16:35.000 --> 16:38.000
or stringization, about 90,000 of them.

16:38.000 --> 16:41.000
Or the value is used in a preprocessor conditional,

16:41.000 --> 16:44.000
so it appears in an #if or #ifdef,

16:44.000 --> 16:45.000
And so on.

16:45.000 --> 16:47.000
That is the case for 23,000 of them.

16:47.000 --> 16:51.000
But for a large majority, we could actually do it.

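The preprocessor-conditional case is the hardest to convert, because #if runs before the compiler ever sees declarations. A minimal sketch, with FEATURE_LEVEL and has_feature invented for the example:

```c
#define FEATURE_LEVEL 2

/* This test needs a preprocessor-time constant: a static const or an
   enum member would not be visible here, so the macro must stay. */
#if FEATURE_LEVEL >= 2
static const int has_feature = 1;
#else
static const int has_feature = 0;
#endif
```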
16:52.000 --> 16:55.000
Well, now let's move back to something more difficult.

16:55.000 --> 16:59.000
The function-like macros, about 100,000 of them.

16:59.000 --> 17:02.000
I've calculated by looking at cases that cannot be easily converted.

17:02.000 --> 17:06.000
That about half of them could be converted into C.

17:06.000 --> 17:11.000
And for the rest, as we've heard in a number of sessions at this conference so far,

17:11.000 --> 17:15.000
Rust could promise an answer because, with a more powerful type system,

17:15.000 --> 17:19.000
it allows for typed, syntactically correct, complete macros.

17:19.000 --> 17:24.000
And it can process code declaratively by manipulating the syntax.

17:24.000 --> 17:27.000
And that allows us to do more complex things.

17:27.000 --> 17:33.000
So, as something that the Germans very elegantly call a Gedankenexperiment,

17:33.000 --> 17:37.000
a thought experiment, let's think about what it would mean to use

17:37.000 --> 17:42.000
Rust to change existing function-like macros into Rust code.

17:42.000 --> 17:47.000
It's not really feasible because each change would have to happen together with the rest of the code.

17:47.000 --> 17:51.000
But let's look at what it would involve.

17:51.000 --> 17:55.000
32,000 function-like macros are not used as functions,

17:55.000 --> 17:59.000
so they create data structures or code dynamically.

17:59.000 --> 18:05.000
But they could be converted into Rust macros.

18:05.000 --> 18:08.000
9,500 of them used token concatenation; for this

18:08.000 --> 18:13.000
we could use Rust's concatenation facilities.

18:13.000 --> 18:18.000
5,000 used non-object parameters, so they take a parameter that's not an object,

18:18.000 --> 18:21.000
like a function name or a variable name, but for this,

18:21.000 --> 18:25.000
we could use Rust macro metavariables.

18:25.000 --> 18:29.000
1,700 of them use stringization; Rust has stringify!,

18:29.000 --> 18:32.000
so this could also work.

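The C stringization being compared with Rust's stringify! works in two steps; STR_, STR, and LEVEL are illustrative names:

```c
#include <string.h>

#define STR_(x) #x        /* stringize the literal argument */
#define STR(x)  STR_(x)   /* expand the argument first, then stringize */
#define LEVEL   3
```

The two-level indirection matters: STR_(LEVEL) yields the text "LEVEL", while STR(LEVEL) first expands LEVEL to 3 and yields "3".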
18:32.000 --> 18:34.000
200 of them affect control flow.

18:34.000 --> 18:37.000
For this, we could use Rust macros, or ideally we should

18:37.000 --> 18:41.000
refactor them to not affect control flow,

18:41.000 --> 18:45.000
so as not to return from inside the macro.

18:45.000 --> 18:51.000
33 use typeof; for these we could use type traits or generic parameters.

18:51.000 --> 18:55.000
Again, I suspect that more could use generic types,

18:55.000 --> 19:00.000
so this number may be larger, but rest gives a very good solution for this,

19:00.000 --> 19:04.000
and this could also improve the code's quality.

19:04.000 --> 19:08.000
And also 88 have incomplete syntax, so they have any dangling open bracket
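The typeof-based macros being counted look roughly like this; MIN is an illustrative name, and the sketch uses the GCC/Clang __typeof__ and statement-expression extensions, as kernel code does:

```c
/* A generic minimum usable with multiple types, the kind of macro
   Rust generics would express with proper type checking. */
#define MIN(a, b) ({                  \
        __typeof__(a) min_a_ = (a);   \
        __typeof__(b) min_b_ = (b);   \
        min_a_ < min_b_ ? min_a_ : min_b_; })
```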

19:08.000 --> 19:10.000
or dangling close brace.

19:10.000 --> 19:15.000
And for this, I think we should be refactor there, also very few.

19:15.000 --> 19:21.000
So overall there are 24 to 43,000 macros about 44%

19:21.000 --> 19:27.000
that could be handled by using the more powerful features of rest.

19:28.000 --> 19:32.000
So to see the overall, these are all the object,

19:32.000 --> 19:35.000
like are all the macros.

19:35.000 --> 19:39.000
These are all C constants that could be

19:39.000 --> 19:42.000
moved directly to C objects.

19:42.000 --> 19:44.000
These are the rest of the objects,

19:44.000 --> 19:48.000
like things that cannot be directly defined as C constants.

19:48.000 --> 19:50.000
And from the function like macros,

19:50.000 --> 19:54.000
some would require Rust, and another large part

19:54.000 --> 19:59.000
can probably be converted into C directly.

19:59.000 --> 20:02.000
So to conclude what have we seen here,

20:02.000 --> 20:07.000
we have seen that the use of the C preprocessor in the Linux kernel

20:07.000 --> 20:12.000
is extensive and introduces technical debt across all preprocessor dimensions,

20:12.000 --> 20:15.000
so no matter how you use the preprocessor,

20:15.000 --> 20:18.000
in many cases, something bad happens.

20:18.000 --> 20:21.000
The usage is still growing in a number of areas,

20:21.000 --> 20:23.000
and it can be quite expensive to address

20:23.000 --> 20:27.000
there is no simple solution that can help us here.

20:27.000 --> 20:30.000
So for the short term, what could we do?

20:30.000 --> 20:33.000
Fix macro explosions, for example,

20:33.000 --> 20:35.000
where this has already happened.

20:35.000 --> 20:37.000
We can correct

20:37.000 --> 20:40.000
frequently occurring cyclic dependencies.

20:40.000 --> 20:42.000
I fixed one as an experiment over the summer, and it

20:42.000 --> 20:45.000
has already been merged.

20:45.000 --> 20:48.000
I think it would be good to reduce other ones.

20:48.000 --> 20:53.000
And also consider converting those 77% of the object-like macros

20:53.000 --> 20:55.000
that can be converted into C constants,

20:55.000 --> 20:58.000
Doing this has the benefit that they will appear in the debugger,

20:58.000 --> 21:02.000
and it will be easier to reason about them.

21:02.000 --> 21:04.000
In the longer term, we can prioritize

21:04.000 --> 21:06.000
refactoring function-like macros,

21:06.000 --> 21:08.000
either into C, where it is possible,

21:08.000 --> 21:11.000
or into Rust, when modules are converted into Rust

21:11.000 --> 21:16.000
or created in Rust from the beginning.

21:17.000 --> 21:19.000
This brings me to the end.

21:19.000 --> 21:21.000
I hope you find it useful. Thank you.


21:31.000 --> 21:33.000
All right. Any questions?

21:33.000 --> 21:35.000
All right.

21:39.000 --> 21:41.000
I'm sorry.

21:42.000 --> 21:46.000
So, I think you're all wondering

21:46.000 --> 21:48.000
why are people doing that, right?

21:48.000 --> 21:50.000
If stuff can easily be written as a C function,

21:50.000 --> 21:51.000
why do we write a macro?

21:51.000 --> 21:55.000
Is it because they were used to it from the 1980s, or what is your theory?

21:55.000 --> 21:57.000
So, why are we using macros?

21:57.000 --> 22:01.000
I looked, especially when I was told that macros are a problem.

22:01.000 --> 22:04.000
I looked back and was asking why people

22:04.000 --> 22:07.000
considered sizeof(int) to be equal to sizeof(pointer),

22:07.000 --> 22:08.000
when can I do it?

22:08.000 --> 22:09.000
Can I do it?

22:09.000 --> 22:11.000
You said: don't do it.

22:11.000 --> 22:12.000
One answer was:

22:12.000 --> 22:14.000
You didn't have children at the time.

22:14.000 --> 22:16.000
We told somebody don't do it.

22:16.000 --> 22:19.000
So, in seriousness, now,

22:19.000 --> 22:20.000
why are we doing it?

22:20.000 --> 22:22.000
I think there's a historical precedent.

22:22.000 --> 22:25.000
Early compilers didn't have the optimization

22:25.000 --> 22:27.000
capabilities that modern compilers have.

22:27.000 --> 22:30.000
So, for example, they could not inline functions

22:30.000 --> 22:32.000
that were very small.

22:32.000 --> 22:34.000
Whereas with a macro, this happens automatically.

22:34.000 --> 22:36.000
So, you helped the compiler.

22:36.000 --> 22:37.000
The same with constants.

22:37.000 --> 22:40.000
So, nowadays GCC recognizes that something is not used

22:40.000 --> 22:44.000
and doesn't need to allocate memory, but old compilers were very primitive

22:44.000 --> 22:47.000
and would allocate memory for that.

22:47.000 --> 22:51.000
And const wasn't even available in the first version of C.

22:51.000 --> 22:55.000
So, a large part, I think, of macro usage has to do with helping

22:55.000 --> 23:00.000
compilers that were not very, very sophisticated.

23:00.000 --> 23:03.000
But also, there are other uses: macros define

23:03.000 --> 23:07.000
generic functions that can be used with multiple types,

23:07.000 --> 23:08.000
such as minimum.

23:08.000 --> 23:12.000
And as we saw before, conditional compilation,

23:12.000 --> 23:14.000
for which there is no alternative.

23:14.000 --> 23:17.000
And also, C doesn't have a powerful module system,

23:17.000 --> 23:22.000
so for this we use the include directive and header files.

23:22.000 --> 23:25.000
Thank you.

23:25.000 --> 23:28.000
Anything else?

23:28.000 --> 23:29.000
Yeah.

23:41.000 --> 23:44.000
In your thought experiment for a conversion to Rust,

23:44.000 --> 23:47.000
what would you see as the unit of conversion?

23:47.000 --> 23:49.000
Because obviously we are not going to go.

23:49.000 --> 23:51.000
Sorry, a bit louder, please.

23:51.000 --> 23:54.000
In your thought experiment, to convert into Rust,

23:54.000 --> 23:58.000
what would you see as the minimum unit for conversion to Rust?

23:58.000 --> 24:02.000
Because obviously we are not going to convert individual macros.

24:02.000 --> 24:05.000
Exactly, this is why it was a thought experiment.

24:05.000 --> 24:08.000
The other thing is, if I understood the question correctly,

24:08.000 --> 24:12.000
what to do with the rest of the code, because the code will need to be converted into Rust.

24:12.000 --> 24:14.000
Is that a question?

24:14.000 --> 24:15.000
Yes.

24:15.000 --> 24:19.000
So, this is a much larger question, which I cannot answer as part of this study.

24:19.000 --> 24:24.000
People are looking at ways to convert C into safe Rust,

24:24.000 --> 24:27.000
because converting into Rust is not difficult;

24:27.000 --> 24:30.000
converting to safe and readable Rust is difficult.

24:30.000 --> 24:33.000
And there are a number of tools that help with that,

24:33.000 --> 24:35.000
such as c2rust.

24:35.000 --> 24:38.000
LLMs can help in small areas.

24:38.000 --> 24:41.000
But that's a huge project,

24:41.000 --> 24:45.000
and I'm not even sure that the community agrees it is something

24:45.000 --> 24:48.000
that is a universally good thing to do.

24:48.000 --> 24:52.000
So, it's beyond the scope of this study.

24:52.000 --> 24:54.000
There's another question there.

25:00.000 --> 25:01.000
Hello.

25:01.000 --> 25:03.000
Thank you for the presentation.

25:03.000 --> 25:08.000
One thing that I would like to find out from your research is

25:08.000 --> 25:10.000
whether moving into Rust is the right thing.

25:10.000 --> 25:13.000
The kernel is moving into Rust gradually,

25:13.000 --> 25:15.000
and there is a lot of funding there.

25:15.000 --> 25:16.000
So, that's also a very good question.

25:16.000 --> 25:18.000
Whether we're moving in the right direction,

25:18.000 --> 25:19.000
it's difficult to say beforehand.

25:19.000 --> 25:21.000
Rust has many advantages,

25:21.000 --> 25:24.000
because it was clearly created as a systems programming language,

25:24.000 --> 25:29.000
and it offers us the ability to write safe code in many areas,

25:29.000 --> 25:32.000
while still being near the hardware.

25:32.000 --> 25:34.000
So, this is a very interesting question.


25:45.000 --> 25:53.000
I think that C is becoming more and more difficult to use in advanced cases.

25:53.000 --> 25:55.000
And there's also a matter of mind share.

25:55.000 --> 25:58.000
So, it's difficult for younger generations to learn it,

25:58.000 --> 26:01.000
because it is no longer taught in many universities.

26:01.000 --> 26:05.000
And this makes it difficult to have a community of new developers

26:05.000 --> 26:08.000
who will contribute to the kernel.

26:08.000 --> 26:10.000
So, I agree maybe Rust has problems,

26:10.000 --> 26:13.000
maybe it is more difficult to learn than C,

26:13.000 --> 26:18.000
but we need to move forward and bring new people into our community.

26:18.000 --> 26:23.000
And maybe Rust is a part of the solution space.

26:23.000 --> 26:25.000
All right, thanks a lot.

26:25.000 --> 26:26.000
Thank you.


