WEBVTT

00:00.000 --> 00:13.000
Hello everyone, it's quite a room.

00:13.000 --> 00:20.320
Thank you for coming. My name is Julien Malka, and I am a NixOS contributor, but also a PhD student

00:20.320 --> 00:23.520
and a researcher at Telecom Paris.

00:23.520 --> 00:28.800
Today I'm going to present work that is more research-oriented, which I did under the supervision

00:28.800 --> 00:37.800
of Théo Zimmermann and Stefano Zacchiroli, and which is about NixOS's reproducibility.

00:37.800 --> 00:45.000
First of all, what are we talking about? What I mean when I say reproducible

00:45.000 --> 00:49.600
is the conception of reproducibility defined by the Reproducible Builds group.

00:49.600 --> 00:55.600
So basically, build reproducibility, bit-by-bit reproducibility, which basically means that,

00:56.600 --> 01:05.600
given the same source code and build environment, you can recreate the same

01:05.600 --> 01:13.600
build, exactly the same, bit-by-bit identical to what another person would get on another computer,

01:13.600 --> 01:16.600
basically.

01:16.600 --> 01:22.600
Build reproducibility is a concept that is historically pretty much associated with NixOS.

01:22.600 --> 01:29.600
Maybe, if you have heard about NixOS before, something that comes to mind when we talk about NixOS

01:29.600 --> 01:33.600
is that it provides reproducible builds.

01:33.600 --> 01:41.600
This was even directly on the front page of the NixOS website for years.

01:41.600 --> 01:49.600
So this is a snapshot from 2023, when the wording changed a little bit, but it was a big marketing argument for NixOS

01:49.600 --> 01:56.600
for a while, but it is actually not something that is true in the general case.

01:56.600 --> 02:09.600
I'm sorry about the slide, which is being a bit messed up by the display, but you could easily write down an expression.

02:10.600 --> 02:13.600
Oh, the mic is on, you should have known.

02:13.600 --> 02:18.600
Apparently there are issues with the audio on the stream, but it's not your fault, sorry about that.

02:18.600 --> 02:26.600
We could easily write an expression that will generate something non-deterministic.

02:26.600 --> 02:38.600
So, just for example, this expression, which takes a value from the RANDOM environment variable, will generate two different artifacts when run twice.
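
NOTE
A minimal sketch of the non-determinism described above, written in Python rather than as a Nix expression.
It assumes a build can be modelled as a function returning bytes; fake_build and digest are hypothetical names, not part of Nix.
The random value stands in for a builder reading the RANDOM shell variable.
```python
import hashlib
import random
def fake_build(source: bytes, rng: random.Random) -> bytes:
    # Hypothetical stand-in for a build step that leaks a random value
    # into its output, much like a shell builder writing $RANDOM.
    return source + b"\nnonce=" + str(rng.randrange(2**64)).encode()
def digest(artifact: bytes) -> str:
    # Bit-by-bit equality is checked by comparing cryptographic hashes.
    return hashlib.sha256(artifact).hexdigest()
rng = random.Random()
first = digest(fake_build(b"hello", rng))
second = digest(fake_build(b"hello", rng))
print("reproducible" if first == second else "not reproducible")
```
Two runs over the same source will almost certainly report "not reproducible", because each artifact embeds a different random value.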

02:38.600 --> 02:46.600
So here you can see that NixOS actually doesn't give you a guarantee of build reproducibility.

02:46.600 --> 03:04.600
So it's something that has created, I would say, a few difficulties with people from the Reproducible Builds group, who are actually the people providing patches and working with upstream projects to make them reproducible.

03:04.600 --> 03:13.600
Because NixOS has used it as a marketing argument for a long time.

03:13.600 --> 03:19.600
So it raises the question: how reproducible is NixOS, truly?

03:19.600 --> 03:29.600
And to answer this question, something you can do is go and check out the reproducible.nixos.org page,

03:29.600 --> 03:36.600
where there is actually some existing monitoring of NixOS's reproducibility.

03:36.600 --> 03:44.600
And on this you can see that the monitoring is done on the ISO images, so the minimal ISO image and the GNOME ISO image.

03:44.600 --> 03:51.600
And you can see that we have an excellent reproducibility rate, around 99% to 100%, for these.

03:51.600 --> 03:58.600
And for the packages that are not reproducible, you can even go and check out their diffoscope.

03:58.600 --> 04:04.600
The diffoscope is like a visual diff of where a binary artifact differs.

04:04.600 --> 04:13.600
And thanks to the diffoscope, you could try to understand where the difference comes from in the small number of packages that are not reproducible.

04:14.600 --> 04:22.600
The problem here is that this is only monitoring a small subset of nixpkgs; the ISO image is a small subset.

04:22.600 --> 04:32.600
And we really have a more general question: what proportion of the whole nixpkgs repository is reproducible?

04:32.600 --> 04:35.600
So why is this question important?

04:35.600 --> 04:41.600
It's because it could have a positive impact on software supply chain security.

04:41.600 --> 04:53.600
So basically, when you are a Nix user, you could compile your packages on your computer, but this is quite costly and sometimes completely infeasible.

04:53.600 --> 04:55.600
So you don't always want to do that.

04:55.600 --> 05:03.600
And sometimes you actually want to download the packages directly from the cache; it's what most Nix users do.

05:03.600 --> 05:07.600
And when you do that, you kind of have a problem of trust.

05:07.600 --> 05:16.600
And the problem of trust is: why would you trust the pre-compiled packages that you download from a cache?

05:16.600 --> 05:30.600
And it's possible that a hacker takes control of the Nix binary cache or the builders and starts providing you, instead of the right

05:30.600 --> 05:43.600
pre-compiled packages, with backdoored packages, and you could do nothing, because if they have control of this server, they could sign the packages, and then you trust them and it's game over.

05:43.600 --> 05:57.600
But with reproducible builds, if you have the property that this package can be compiled on several machines and you always obtain the same thing, you could ask other trusted builders to compile it on their own.

05:57.600 --> 06:03.600
And if they all obtain the same result, they can provide you with the hash of this result.

06:03.600 --> 06:11.600
And then you can download the package from the cache, check that you have the same hash, and be sure that you are actually downloading the right thing.
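
NOTE
The check described above can be sketched roughly as follows, assuming each trusted builder reports a SHA-256 hex digest of its own build.
verify_download and sha256_hex are hypothetical helper names; this is not how the Nix cache or its signatures actually work, only an illustration of the idea.
```python
import hashlib
def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()
def verify_download(downloaded: bytes, builder_hashes: list[str]) -> bool:
    # Accept only if there is at least one report, all trusted builders
    # agree on a single hash, and the downloaded artifact matches it.
    if not builder_hashes or len(set(builder_hashes)) != 1:
        return False
    return sha256_hex(downloaded) == builder_hashes[0]
good = b"package contents"
reports = [sha256_hex(good), sha256_hex(good), sha256_hex(good)]
print(verify_download(good, reports))
print(verify_download(b"backdoored contents", reports))
```
If the builders disagree with each other, the function refuses to conclude anything, which mirrors why this scheme only works for reproducible packages.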

06:11.600 --> 06:26.600
So this is why we care about reproducible builds: if you don't have reproducible builds, you cannot do that, because you can ask several builders to compile your package, and then they obtain different things, and you can say nothing interesting about the result.

06:26.600 --> 06:33.600
So, we are interested in reproducible builds, we are interested in how many of these packages are reproducible, but we don't know.

06:33.600 --> 06:42.600
So now, science comes to the rescue, and what I'm going to present now are the results from a paper

06:42.600 --> 06:51.600
that is going to be published at a conference named Mining Software Repositories this year.

06:51.600 --> 06:58.600
And its title is: "Does Functional Package Management Enable Reproducible Builds at Scale? Yes."

06:58.600 --> 07:04.600
Well, this is what we did. This is a pipeline summarizing what we did.

07:04.600 --> 07:13.600
Basically, we took some revisions of nixpkgs, 17 revisions that were taken from 2017 to 2023.

07:14.600 --> 07:23.600
And for each of these revisions, we did in parallel the same thing that the Nix continuous integration, Hydra, does.

07:23.600 --> 07:32.600
So we evaluated release.nix, which is the file that lists all the packages that must be built by the continuous integration.

07:32.600 --> 07:38.600
Then we built all these packages, like Hydra does.

07:39.600 --> 07:46.600
And then, when a package is built successfully, Hydra uploads it to the Nix binary cache.

07:46.600 --> 07:54.600
And what we did is compare our local compilation result with the historical package that is in the Nix cache.

07:54.600 --> 07:57.600
So when you do that, there are several things that can happen.

07:57.600 --> 08:07.600
Either you obtain exactly the same artifact, bit by bit, in which case we can say that it is bitwise reproducible, which is what we want.

08:07.600 --> 08:13.600
Otherwise, you could obtain something different, or even fail to rebuild the package.
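
NOTE
A minimal sketch of the three outcomes just listed, assuming the local rebuild and the cached artifact are both available as bytes.
Outcome and classify are hypothetical names, and a failed rebuild is modelled as None; the study's real tooling is not shown in the talk.
```python
import hashlib
from enum import Enum
from typing import Optional
class Outcome(Enum):
    REPRODUCIBLE = "bitwise reproducible"
    DIFFERENT = "buildable but not reproducible"
    FAILED = "failed to rebuild"
def classify(local_artifact: Optional[bytes], cached_artifact: bytes) -> Outcome:
    # None models a rebuild that failed; otherwise compare hashes bit by bit.
    if local_artifact is None:
        return Outcome.FAILED
    local_hash = hashlib.sha256(local_artifact).digest()
    cached_hash = hashlib.sha256(cached_artifact).digest()
    return Outcome.REPRODUCIBLE if local_hash == cached_hash else Outcome.DIFFERENT
print(classify(b"same bytes", b"same bytes").value)
```
Only the DIFFERENT case would then go on to a diff-based analysis, since identical artifacts have nothing left to explain.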

08:13.600 --> 08:19.600
And in this case, we have an analysis pipeline where we do two things.

08:19.600 --> 08:28.600
First of all, we generate the diffoscope, which is the data structure I showed you before, where you can better understand where the unreproducibility is.

08:28.600 --> 08:37.600
And we try to analyze the diffoscope to understand where the package is not reproducible and why it is not reproducible.

08:37.600 --> 08:42.600
And we also have something that I will not really cover today.

08:42.600 --> 08:53.600
We also bisect nixpkgs in order to understand when the unreproducibility is fixed, in which commit and in which pull request, and try to understand, with the context,

08:53.600 --> 08:57.600
more about how the unreproducibilities are fixed in nixpkgs.

08:58.600 --> 09:05.600
So, this is what we found out.

09:05.600 --> 09:20.600
So, this is the proportion of packages that are reproducible, in blue, or, in red, buildable but not reproducible, from 2017 to 2023.

09:21.600 --> 09:29.600
And we see that there is an increase in the reproducibility of nixpkgs over this time period.

09:29.600 --> 09:46.600
So, starting from about 70% in 2017, we are now reaching about 90%, 91% to be precise, at the end of our study in 2023.

09:47.600 --> 09:52.600
You can see something interesting here that I will cover just after.

09:52.600 --> 09:59.600
If you look at this visualization, which is more like an absolute one:

09:59.600 --> 10:04.600
So, it is the same result, but instead of proportions, you have absolute numbers of packages.

10:04.600 --> 10:15.600
And here you can also see the sheer growth of nixpkgs, from about 20,000 packages to more than 70,000 packages in the time period.

10:16.600 --> 10:22.600
So, with this growth, we also have a growth of reproducibility.

10:22.600 --> 10:30.600
With one notable exception here: you see that between these two points, there is a growth in the number of packages,

10:30.600 --> 10:36.600
but there is a drop in the number of reproducible packages.

10:37.600 --> 10:43.600
And at this point, we tried to understand the cause of this.

10:43.600 --> 10:48.600
And so, we basically dug into the packages, looked at issues, pull requests, and everything.

10:48.600 --> 10:57.600
And we found that at this point, around here, there was a pip upgrade that broke the bytecode generation.

10:57.600 --> 10:59.600
It was not deterministic anymore.

10:59.600 --> 11:05.600
And it was actually tracked by the reproducibility people in nixpkgs.

11:05.600 --> 11:08.600
They went upstream, they discussed.

11:08.600 --> 11:15.600
They first applied a patch, then upstream fixed it, and they applied the update in nixpkgs.

11:15.600 --> 11:17.600
And then the regression was fixed.

11:17.600 --> 11:23.600
This is the one main regression that we found in our experiment.

11:23.600 --> 11:27.600
So, this is the first key result that we have.

11:27.600 --> 11:32.600
Currently, 91% of packages are bit-by-bit reproducible.

11:32.600 --> 11:37.600
What we did afterwards is we worked with the diffoscope.

11:37.600 --> 11:45.600
So, basically, for each non-reproducible package, we generated a diffoscope, which is something that can take pretty long.

11:46.600 --> 11:49.600
We put a timeout at five minutes.

11:49.600 --> 11:54.600
We dedicated five minutes of compute time to each non-reproducible package to generate a diffoscope.

11:54.600 --> 11:58.600
And we got a big set of diffoscopes.

11:58.600 --> 12:02.600
I think more than 100,000 diffoscopes.

12:02.600 --> 12:10.600
And on these diffoscopes, what we did is we first tried to manually inspect them to see if we could find

12:10.600 --> 12:17.600
things that you can identify regularly in diffoscopes, like common causes of non-reproducibility.

12:17.600 --> 12:28.600
And once we had this, we wrote heuristics that could automatically identify these causes.

12:28.600 --> 12:30.600
So, these are our four heuristics.

12:30.600 --> 12:32.600
The first one is dates.

12:33.600 --> 12:38.600
It's the number one recommendation on the Reproducible Builds website.

12:38.600 --> 12:43.600
It says: stop embedding the current date in the build result.

12:43.600 --> 12:48.600
And instead, embed, like, the start of the Unix timestamp.

12:48.600 --> 12:52.600
Like, in January 1970, I don't remember the exact date.

12:52.600 --> 12:56.600
But anyway, if you have to embed a date, always

12:56.600 --> 12:59.600
embed the same one, a deterministic one,

12:59.600 --> 13:01.600
instead of today's date.
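
NOTE
One common way to follow this advice is the SOURCE_DATE_EPOCH environment variable specified by the Reproducible Builds project; the talk does not name it, so treating it as the mechanism here is an assumption.
The sketch below uses hypothetical function names, build_timestamp and banner, to show a build step that embeds a fixed date when the variable is set.
```python
import os
import time
def build_timestamp(environ=os.environ) -> int:
    # Prefer SOURCE_DATE_EPOCH (seconds since the Unix epoch) when it is set;
    # falling back to the current time makes the output non-deterministic.
    value = environ.get("SOURCE_DATE_EPOCH")
    return int(value) if value is not None else int(time.time())
def banner(environ=os.environ) -> str:
    stamp = time.gmtime(build_timestamp(environ))
    return "built on " + time.strftime("%Y-%m-%d", stamp)
print(banner({"SOURCE_DATE_EPOCH": "0"}))
```
With the variable set to 0, the banner always reads "built on 1970-01-01", whatever day the build actually runs.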

13:01.600 --> 13:04.600
This is the main cause that we identified.

13:04.600 --> 13:09.600
And it's the blue line here.

13:09.600 --> 13:14.600
And the second one is the uname output.

13:14.600 --> 13:22.600
What we call uname output is basically when, in the final artifact, you embed some impure data

13:22.600 --> 13:23.600
about your builder.

13:23.600 --> 13:27.600
For example, the kernel version it is running, things like that.

13:28.600 --> 13:34.600
So, we detected quite a lot of them. It's the red line here.

13:34.600 --> 13:36.600
Then there are environment variables.

13:36.600 --> 13:41.600
So, when some part of the environment leaks into the final artifact.

13:41.600 --> 13:48.600
For example, the number of cores that Nix uses to build this specific package.

13:48.600 --> 13:52.600
This is the green line here.

13:53.600 --> 13:57.600
And the build ID: some ecosystems, like Go,

13:57.600 --> 14:02.600
will generate a unique build ID for a package, unique but not deterministic,

14:02.600 --> 14:05.600
that they will embed in the final result.

14:05.600 --> 14:11.600
So, these are the four causes of non-reproducibility that we have found.
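
NOTE
To give a feel for what such heuristics could look like, here is a rough sketch that tags a textual diff with the four causes named above.
The regular expressions are hypothetical and chosen only for illustration (NIX_BUILD_CORES is used as one example of a leaked environment variable); the heuristics actually used in the study are not shown in the talk.
```python
import re
# Hypothetical patterns, for illustration only.
HEURISTICS = {
    "date": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
    "uname output": re.compile(r"\bLinux \S+ \d+\.\d+\.\d+"),
    "environment variable": re.compile(r"\bNIX_BUILD_CORES=\d+"),
    "build id": re.compile(r"\bbuild[ -]?id\b", re.IGNORECASE),
}
def classify_diff(diff_text: str) -> list[str]:
    # Return the name of every heuristic whose pattern occurs in the diff text.
    return [name for name, pattern in HEURISTICS.items() if pattern.search(diff_text)]
print(classify_diff("-compiled 2021-03-04\n+compiled 2024-01-02"))
```
A diff that matches none of the patterns returns an empty list, which corresponds to the cases that still need manual inspection.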

14:11.600 --> 14:18.600
Interestingly, what we can say is that even the most common cause

14:18.600 --> 14:24.600
is still present a lot in the current nixpkgs,

14:24.600 --> 14:28.600
which means that there is still some low-hanging fruit that we can fix

14:28.600 --> 14:31.600
to improve nixpkgs reproducibility.

14:31.600 --> 14:38.600
And with those four heuristics, we covered about 20% of the cases of non-reproducibility,

14:38.600 --> 14:43.600
which means that there are still 80% that we, at this point, don't understand well,

14:43.600 --> 14:47.600
that we think are more difficult to categorize and understand,

14:47.600 --> 14:52.600
and for which we might need the help of expert folks.

14:52.600 --> 14:56.600
As a conclusion: bitwise reproducibility at this point is about 90%.

14:56.600 --> 15:02.600
This justifies putting some effort into using this as an asset to improve

15:02.600 --> 15:08.600
software supply chain security, which is what I'm going to do in my next works.

15:08.600 --> 15:11.600
Thank you very much.

15:13.600 --> 15:22.600
I don't know how much time we have for questions, but I'm happy to.

15:22.600 --> 15:24.600
Time for questions.

15:24.600 --> 15:28.600
Who has a question?

15:28.600 --> 15:29.600
Yeah.

15:29.600 --> 15:31.600
Do you need a mic?

15:31.600 --> 15:33.600
I will repeat the questions.

15:33.600 --> 15:35.600
And then, the speaker will be picked up.

15:35.600 --> 15:39.600
When you showed the plot showing that the number of packages is

15:39.600 --> 15:43.600
increasing, and the number of... yeah, this one.

15:43.600 --> 15:47.600
No, the second one.

15:47.600 --> 15:48.600
This one.

15:48.600 --> 15:49.600
Yeah.

15:49.600 --> 15:51.600
Are those the same packages?

15:51.600 --> 15:55.600
I mean, the red ones are always the same, because the number seems so constant

15:55.600 --> 15:57.600
besides the dip.

15:57.600 --> 16:01.600
So the question is, basically, in this red area,

16:01.600 --> 16:04.600
are the non-reproducible packages always the same?

16:04.600 --> 16:08.600
And I don't think we precisely analyzed that.

16:08.600 --> 16:10.600
This red area is shrinking.

16:10.600 --> 16:14.600
So at the very least, some of them get fixed, that we know.

16:14.600 --> 16:20.600
But we don't know exactly if there is like a small village of irreducible

16:20.600 --> 16:24.600
packages that refuse to become reproducible, which are always the same here.

16:24.600 --> 16:28.600
But that's something we could try to determine pretty easily.

16:28.600 --> 16:30.600
So I'll follow up on this.

16:30.600 --> 16:32.600
Okay.

16:32.600 --> 16:34.600
There's another question back here.

16:34.600 --> 16:35.600
What's your question?

16:35.600 --> 16:36.600
Yeah.

16:36.600 --> 16:37.600
Is there a red management?

16:37.600 --> 16:44.600
You spoke about the threat of a backdoor here.

16:44.600 --> 16:46.600
You learn what to pay with P.K.

16:46.600 --> 16:47.600
The deal.

16:47.600 --> 16:50.600
But the main order, right in the sprites,

16:50.600 --> 16:54.600
we're heading no, none revered with Reversive.

16:54.600 --> 17:00.600
So the question is: what's the main other threat that can be identified

17:00.600 --> 17:03.600
in non-reproducible builds?

17:03.600 --> 17:07.600
So actually, non-reproducible builds are not a threat in themselves.

17:07.600 --> 17:13.600
It's a missed opportunity for securing the software supply chain.

17:13.600 --> 17:14.600
Yes.

17:14.600 --> 17:19.600
What are the worst compilers for this win?

17:19.600 --> 17:20.600
Yes.

17:20.600 --> 17:23.600
What are the worst compilers for reproducible builds?

17:23.600 --> 17:27.600
So I think we covered this, with more precision, in the paper.

17:27.600 --> 17:30.600
But the Haskell ecosystem is very bad for reproducibility.

17:30.600 --> 17:40.600
About 60% of the packages are reproducible.

17:40.600 --> 17:43.600
And we think that it's going to improve,

17:43.600 --> 17:50.600
now that, in an update, they added a flag to have deterministic

17:50.600 --> 17:52.600
binary generation.

17:52.600 --> 17:56.600
So it would be interesting to know if this is going to have an impact.

17:56.600 --> 17:59.600
As far as I know, this flag is also not used in nixpkgs so far.

17:59.600 --> 18:02.600
So we'll see how this goes.

18:02.600 --> 18:03.600
Yes.

18:03.600 --> 18:04.600
Yes.

18:04.600 --> 18:05.600
Now it's certainly good.

18:05.600 --> 18:12.600
If, and good, particularly huge, you will compare the, the either I

18:12.600 --> 18:15.600
Sporey convenient to what you will be at a big time.

18:15.600 --> 18:20.600
So, it's what is added is the team as as what, that is,

18:20.600 --> 18:22.600
by either and the agency is tested is.

18:22.600 --> 18:25.600
You assume that it needs test of reproducible.

18:25.600 --> 18:27.600
But is it not necessary to write?

18:27.600 --> 18:33.960
because the only need that in the app to this is the verification of it to pay it to you.

18:33.960 --> 18:36.720
Yes, so I said, how confident are you?

18:36.720 --> 18:43.760
That's the impact something is still the same then, it's really very positive, man.

18:43.760 --> 18:49.400
Yes, so does you try to be on different machines with different questions and clear?

18:49.400 --> 18:50.560
Yes, so look at any of you.

18:50.560 --> 18:57.320
The question is, like, okay, the question is, basically, we build once.

18:58.280 --> 19:06.200
what has been built historically, and we compare, and this is a weak proof that it is actually reproducible, so how confident are we?

19:06.200 --> 19:13.560
And my answer is: you cannot really prove something is reproducible just because you tried;

19:13.560 --> 19:23.480
it's much easier to prove something is not reproducible, you need one counterexample, and proving something is reproducible is not really possible, but

19:23.480 --> 19:32.280
we build only once, years after, on different hardware, not even running NixOS, and we find the same thing.

19:32.280 --> 19:38.080
We think it's good enough proof that this is reproducible.

19:38.080 --> 19:40.280
It makes better canals at back there.

19:40.280 --> 19:48.280
So I think we don't have any more, like this, one thing that I also wanted to add to that,

19:48.360 --> 19:54.600
or a few things, the next people can come and set up their laptop in the meantime.

19:54.600 --> 19:58.120
So there's, you had this talk today, thank you very much.

19:58.120 --> 20:03.480
Another round of applause for the speaker.

20:03.480 --> 20:05.480
I also want to...

