WEBVTT

00:00.000 --> 00:18.000
Hi, my name is Piotr, and I'm the current chair of the Apache Logging PMC.

00:18.000 --> 00:23.000
I wanted to talk to you a little bit about what happened four years ago.

00:23.000 --> 00:27.000
With Log4Shell, to see if we can do something better

00:27.000 --> 00:33.000
right now, if we can provide a solution faster.

00:33.000 --> 00:39.000
I will not tell you that Log4Shell will never happen again, because that's false.

00:39.000 --> 00:43.000
It's just a question of when it will happen.

00:44.000 --> 00:47.000
So a little bit about Log4j.

00:47.000 --> 00:52.000
Actually there are two versions which are very distinct.

00:52.000 --> 01:01.000
One is Log4j 1, which was written by Ceki Gülcü in 2001.

01:01.000 --> 01:05.000
And, well, Ceki soon afterwards abandoned

01:05.000 --> 01:11.000
the work at Apache and started the successors, SLF4J and Logback.

01:11.000 --> 01:19.000
And actually Log4j 2, which was the library affected, was done by another,

01:19.000 --> 01:24.000
an entirely different team. Most of these people live in the US,

01:24.000 --> 01:30.000
which is important for the timing of the vulnerability.

01:30.000 --> 01:33.000
I wasn't there.

01:34.000 --> 01:39.000
When this happened, I joined the Log4j team.

01:39.000 --> 01:42.000
One month later, I started contributing.

01:42.000 --> 01:48.000
It was so popular that there were a lot of small bug reports, so it was easy to help.

01:48.000 --> 01:58.000
Yeah, and, well, since June last year I'm the chair, the PMC chair of Apache Logging.

01:58.000 --> 02:06.000
Okay, so some of you remember the 9th or 10th of December 2021,

02:06.000 --> 02:12.000
it depends which side of the Atlantic you are on.

02:12.000 --> 02:15.000
And what happened?

02:15.000 --> 02:23.000
It happened that a small mistake caused many Java applications

02:23.000 --> 02:31.000
to be vulnerable to a remote code execution attack.

02:31.000 --> 02:37.000
So, to give you an idea of what kind of mistake it was:

02:37.000 --> 02:42.000
Log4j allows you to configure...

02:42.000 --> 02:47.000
There is a pattern layout that allows you to configure how logs are printed.

02:47.000 --> 02:52.000
And it has two kinds of placeholders.

02:52.000 --> 02:58.000
Some are called message patterns, with a percent sign, which for example print the date or

02:58.000 --> 03:05.000
the message, and the others are similar to Java properties. As you know,

03:05.000 --> 03:12.000
Log4j 1 interpolated Java properties, but this system is now extensible

03:12.000 --> 03:14.000
and you have many sources.

03:15.000 --> 03:19.000
Not just system properties, but also environment variables and so on.

03:19.000 --> 03:23.000
So the problem was that when you have such a string,

03:23.000 --> 03:31.000
instead of interpolating the lookups first, it interpolated the patterns

03:31.000 --> 03:38.000
and among the patterns you have the message, which can come from user data.

03:38.000 --> 03:45.000
So when a request arrived at the server, many developers

03:45.000 --> 03:49.000
just logged what arrived in the message.

03:49.000 --> 03:52.000
And so the message is expanded.

03:52.000 --> 03:56.000
The message can contain lookups and the lookups are evaluated again.

03:56.000 --> 04:01.000
And there was a lookup that was particularly dangerous,

04:01.000 --> 04:07.000
which was JNDI. I think that up to that moment,

04:07.000 --> 04:14.000
except old Java developers, nobody remembered that you can use JNDI

04:14.000 --> 04:20.000
to download stuff from an LDAP server and execute it locally.

04:20.000 --> 04:32.000
So the combination of these two facts affected millions of applications worldwide.

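NOTE
Editor's sketch of the ordering problem described above. This is a toy model
using only the Java standard library, not Log4j's actual code: the class name,
the LOOKUPS table and the method names are all hypothetical. It shows why
expanding the message pattern before resolving lookups lets user input be
evaluated as a lookup.
```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class LookupOrderSketch {
    // Hypothetical lookup table standing in for sources such as environment variables.
    static final Map<String, String> LOOKUPS = Map.of("env:USER", "alice");
    static final Pattern LOOKUP = Pattern.compile("\\$\\{([^}]*)\\}");
    // Replace every ${key} with its looked-up value (unknown keys are left unchanged).
    static String resolveLookups(String text) {
        Matcher m = LOOKUP.matcher(text);
        return m.replaceAll(r -> Matcher.quoteReplacement(LOOKUPS.getOrDefault(r.group(1), r.group())));
    }
    // Replace the %m placeholder with the (possibly attacker-controlled) message.
    static String expandPattern(String layout, String message) {
        return layout.replace("%m", message);
    }
    // Flawed order: patterns first, so lookups hidden in the message are evaluated too.
    static String flawed(String layout, String message) {
        return resolveLookups(expandPattern(layout, message));
    }
    // Safer order: lookups in the configuration first, then the message is inserted verbatim.
    static String safer(String layout, String message) {
        return expandPattern(resolveLookups(layout), message);
    }
    public static void main(String[] args) {
        String layout = "[${env:USER}] %m";
        String userInput = "hello ${env:USER}";
        System.out.println(flawed(layout, userInput));
        System.out.println(safer(layout, userInput));
    }
}
```
With the flawed order the program prints "[alice] hello alice"; with the safer
order it prints "[alice] hello ${env:USER}", leaving the user's text untouched.
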
04:32.000 --> 04:38.000
So the flaw, this inversion, was actually discovered in 2014,

04:38.000 --> 04:43.000
not 2021, but unfortunately it was classified as a feature.

04:43.000 --> 04:47.000
And we are very committed to backward compatibility.

04:47.000 --> 04:58.000
So if somebody was using that feature, we gave them an option to opt in or out.

04:58.000 --> 05:03.000
So how many days did it take to solve this problem?

05:03.000 --> 05:08.000
It took from November the 24th to December the 9th.

05:08.000 --> 05:11.000
November the 25th was Thanksgiving.

05:11.000 --> 05:19.000
So in the United States, let's remove a couple of days of work

05:19.000 --> 05:24.000
because everybody was with their families.

05:24.000 --> 05:30.000
Another characteristic of this timeline is that unfortunately,

05:30.000 --> 05:36.000
we published a patch on GitHub on November the 30th,

05:36.000 --> 05:46.000
which made the patch available for users for 9 days.

05:46.000 --> 05:53.000
Another problem here you can see is that the release vote,

05:53.000 --> 05:57.000
the first release vote... In Apache,

05:57.000 --> 06:01.000
normal release votes last 72 hours.

06:01.000 --> 06:06.000
But it's written everywhere 72 hours, 72 hours.

06:06.000 --> 06:11.000
So most projects think that it must be 72 hours.

06:11.000 --> 06:17.000
Actually, my colleagues discovered later that it doesn't have to be.

06:17.000 --> 06:24.000
So it can be 8, but every member of the PMC must know that there will be a vote.

06:24.000 --> 06:26.000
So this is a problem.

06:26.000 --> 06:32.000
So you need 72 hours to tell people that you will vote for 8 hours.

06:32.000 --> 06:38.000
Okay, so this is the timeline that we can correct.

06:38.000 --> 06:41.000
We can reduce because it's our timeline.

06:41.000 --> 06:48.000
Another timeline that I don't know how to improve is this one.

06:48.000 --> 07:00.000
This is from Sonatype; they follow the downloads of vulnerable versions of Log4j.

07:01.000 --> 07:16.000
And the timeline ends in July 2024, when around 10% of downloads were still for vulnerable versions.

07:16.000 --> 07:24.000
If you take into account that most big companies have proxies, so they don't download it every time,

07:24.000 --> 07:29.000
well, it's a lot.

07:29.000 --> 07:35.000
And the newest Azul State of Java report, which I read before

07:35.000 --> 07:43.000
FOSDEM, says that still today 49% of respondents say they are affected by Log4j

07:43.000 --> 07:48.000
vulnerabilities. I must ask them what precisely that means,

07:48.000 --> 07:53.000
because we didn't introduce any vulnerabilities in the past three years.

07:53.000 --> 08:07.000
Okay, so this is the summary of how Log4Shell was solved.

08:07.000 --> 08:12.000
And let's see if we can do something better.

08:12.000 --> 08:24.000
So after this problem, we tried to identify the main obstacles to releasing a fix fast.

08:24.000 --> 08:27.000
One was the tests were flaky.

08:27.000 --> 08:31.000
So there was a big chance that a release build fails.

08:31.000 --> 08:35.000
Site generation was not slow, but hyper slow.

08:35.000 --> 08:37.000
It was a couple of hours.

08:37.000 --> 08:40.000
Due to the technology that we used.

08:41.000 --> 08:44.000
It was difficult to make a release.

08:44.000 --> 08:52.000
So we had two people who, at the time, knew how to release Log4j.

08:52.000 --> 08:58.000
We also identified the problem that we had too many features.

08:58.000 --> 09:02.000
And they were all included.

09:02.000 --> 09:07.000
And, well, some minor documentation problems.

09:08.000 --> 09:12.000
So there were two companies that helped us solve this.

09:12.000 --> 09:18.000
One was Tidelift, which started supporting Log4j in January 2023.

09:18.000 --> 09:21.000
And then, with Christian and Volkan,

09:21.000 --> 09:34.000
we got a big Sovereign Tech Fund grant that allowed us to work for 60 months just on the project.

09:34.000 --> 09:39.000
So Vulnerability and Vulnerability are not involved.

09:39.000 --> 09:44.000
So what did we achieve?

09:44.000 --> 09:48.000
What do we still need to achieve?

09:48.000 --> 09:56.000
So as I was saying, releasing the project was very complex because you had to run everything on your machine.

09:56.000 --> 10:03.000
We had a CI, but the CI was not allowed to make releases; it

10:03.000 --> 10:09.240
was not programmed to make releases, so all the work fell on the release manager, and there was

10:09.240 --> 10:16.840
one person who could do it. Nowadays everybody can do it, so if I'm here at

10:16.840 --> 10:22.760
FOSDEM and something happens, any PMC member can do it. So we managed to

10:24.680 --> 10:32.680
totally automate the first part of the release up to the vote, so the release manager needs

10:32.680 --> 10:39.800
just to do the stuff that a human must do: prepare release notes, decide when to cut the release.

10:41.160 --> 10:46.760
There are still a lot of manual things, like publishing to Nexus, then publishing

10:47.720 --> 10:54.360
to the Subversion repository, publishing the website. These are not handled automatically.

10:55.160 --> 11:00.280
There is a project inside Apache, Apache Trusted Releases, a platform, and we believe that

11:00.280 --> 11:07.080
in a couple of days... years, they will do that for us. So we will just press a button:

11:07.080 --> 11:13.800
the votes are there, release the Kraken. Okay, what are the most important

11:14.760 --> 11:22.760
projects that help us with this automation? As I said before, we released on our own computers

11:22.760 --> 11:28.920
because we need a trusted platform to make releases, so that nobody can mess with them.

11:29.880 --> 11:35.000
Thanks to the Reproducible Builds project,

11:36.280 --> 11:44.200
Hervé, thank you, we now have reproducibility, which means that the CI can make the build

11:45.000 --> 11:49.960
and we can verify it locally. So once we verify it locally, it's okay. We can release it.

11:50.920 --> 12:04.600
Okay: Dependabot updates our dependencies; GitHub Actions is what we chose for CI/CD,

12:05.640 --> 12:16.040
before that we were using Jenkins. And of course, a lot of Maven plugins and test libraries that you never

12:16.120 --> 12:23.880
hear about in talks. They are removed from SBOMs, they are removed from everywhere,

12:23.880 --> 12:31.800
but without these things we would not be able to publish fast. So right now, we can publish in half an hour.

12:34.360 --> 12:40.200
I had already presented it at another conference, so at that conference I showed how it

12:40.200 --> 12:47.320
happens. Here, we are in the middle of solving some problems, so I cannot make a release. The second

12:47.320 --> 12:58.360
problem was, as I said, the tests. So we had only sequential tests, unit tests, which were

12:58.360 --> 13:06.920
JUnit 3 to JUnit 5, and there were a lot of flaky tests. So that was the problem: you

13:07.000 --> 13:12.200
need to make a release fast, and you have a test failing, and you need to analyze whether it's

13:12.200 --> 13:18.920
a flaky test or something real. So nowadays we have switched to parallel tests.

13:21.480 --> 13:25.400
We added dynamic tests, which were needed for us to get the

13:25.720 --> 13:36.920
OpenSSF Best Practices mark. There are still flaky tests, and that's something where,

13:38.200 --> 13:46.440
if you want to help, please help us. It's always good. We had a present from STF, which paid

13:46.440 --> 13:53.480
another company to convert everything to JUnit 5. So now we are on JUnit 5; there are a few more flaky

13:53.560 --> 13:59.800
tests, and we are working on that, but the build takes half an hour including the website.

14:01.720 --> 14:12.600
This one was a little bit harder. There were a lot of discussions on how to handle features, because

14:13.880 --> 14:22.440
one of the important things is that, okay, if you are a maintainer, you are allowed to

14:22.440 --> 14:32.440
patch your feature, right? Well, this is a big question. So no, we felt and we even

14:33.480 --> 14:43.800
in Log4j 3, we did a cleanup. We modularized many features, so if they are affected by a vulnerability

14:43.800 --> 14:48.840
and you don't have them on your stack, it's not your problem. We removed a lot of features

14:49.800 --> 14:57.640
that nobody was using. Even if a PMC member contributed the feature, maybe he was not using it

14:57.640 --> 15:06.440
right now. And now we have a strategy for accepting features. Log4j is modularized, so you can publish

15:06.440 --> 15:16.680
your own Git repo, publish the feature in a JAR. If it becomes popular and you need an

15:16.760 --> 15:29.880
open source steward, you can bring it to our project. Okay. So that was what helped us improve

15:29.880 --> 15:37.000
our internal processes. There is also the problem of how to tell people that we have a vulnerability.

15:38.120 --> 15:46.440
So right now SBOMs are very, very popular, but honestly, we are a library. So we don't

15:46.520 --> 15:53.400
decide what people are putting in their applications. For vulnerability handling, SBOMs

15:53.400 --> 15:59.800
for our project don't do anything. What an SBOM could do, and what we implemented,

16:01.080 --> 16:06.680
is that you can link other important stuff from an SBOM. So for example, we link a VDR,

16:07.560 --> 16:14.760
which is the list of our vulnerabilities, described as we would have done it. So it doesn't come

16:14.840 --> 16:32.280
from the NVD. It comes right from a human-written VDR. That's what we've been able to do up until now.

16:33.560 --> 16:39.880
In the near future, this month, I'm working on some tools that will use these SBOMs to

16:39.880 --> 16:49.080
download the VDRs, to check the compatibility between the SBOM of the project and what is

16:49.080 --> 16:59.080
published in the dependencies and so on. But of course, the ultimate solution is what you will hear

16:59.080 --> 17:05.720
about tomorrow in this room, I think: TEA, the Transparency Exchange API.

17:06.200 --> 17:15.400
As soon as TEA is ready, we will try to publish VDRs, which versions are supported, which are

17:15.400 --> 17:24.520
not, through this interface. Okay, and the last thing we did: there are some things that we

17:24.520 --> 17:36.440
cannot fix, because depending on how users use a project, they can do it in an unsafe way.

17:37.240 --> 17:47.720
One of the problems is that Log4j allows you to have placeholders for parameters, or string

17:47.720 --> 17:56.360
concatenation. You can choose if you want string concatenation or pass the parameters separately.

17:56.360 --> 18:05.080
If you do both together, you can have a log injection, because if the user input contains the

18:05.080 --> 18:13.720
placeholder, then the parameter will be printed here and not there. There's nothing we can do about it.

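NOTE
Editor's sketch of the injection described above. The format method below is a
toy stand-in written for this note (hypothetical class and method names, Java
standard library only); it is not Log4j's formatter. It only illustrates how a
placeholder smuggled in through string concatenation can consume a parameter
meant for another position.
```java
public class PlaceholderSketch {
    // Toy formatter: each {} in the format string is replaced by the next parameter.
    static String format(String fmt, Object... params) {
        StringBuilder out = new StringBuilder();
        int next = 0;
        int from = 0;
        int at;
        while ((at = fmt.indexOf("{}", from)) >= 0 && next < params.length) {
            out.append(fmt, from, at).append(params[next++]);
            from = at + 2;
        }
        return out.append(fmt.substring(from)).toString();
    }
    public static void main(String[] args) {
        String user = "mallory {}"; // attacker-controlled text containing a placeholder
        String ip = "10.0.0.1";
        // Mixed style: concatenation puts the attacker's {} into the format string.
        System.out.println(format("login by " + user + " from {}", ip));
        // Parameters only: user input is never parsed as part of the format string.
        System.out.println(format("login by {} from {}", user, ip));
    }
}
```
The mixed call prints "login by mallory 10.0.0.1 from {}", so the address lands
in the wrong place; the parameters-only call prints
"login by mallory {} from 10.0.0.1".
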
18:13.720 --> 18:24.760
So the documentation is also important, to tell users how to safely use Log4j. Okay, so now I have

18:26.040 --> 18:35.080
a new proposed timeline for when Log5Shell starts. So on day 0: we always start to answer

18:35.160 --> 18:43.000
reports within 24 hours. We request the CVE number. As soon as we have the CVE number, we request

18:43.000 --> 18:50.760
a private Git repo; back then we didn't know about that possibility. So the patches will be fully in Git, but you will

18:50.760 --> 18:59.880
not see them. And as for the problem with the voting procedure, we will start not a vote, but a consensus

18:59.960 --> 19:09.080
to shorten the vote. So all the PMC members, which are all over the world, have 72 hours to agree

19:09.080 --> 19:18.760
to make a 24-hour vote, for example, or 8 hours, to decide: okay, we will release at this time

19:19.080 --> 19:28.680
with a short vote. Then, we will not try to keep the feature at all costs. So the first thing we do

19:28.680 --> 19:35.960
is two lines of Java code removing the feature; if nobody comes up with a better solution,

19:35.960 --> 19:43.720
that's what will be in the patch. Because in the end, after Log4Shell, strongly limiting

19:43.720 --> 19:56.120
JNDI was the final solution. So on day 5, maybe 6, we can already prepare the release;

19:56.120 --> 20:05.400
we don't need a separate day for that, because it takes half an hour. During the vote, whether it's 24 hours

20:05.480 --> 20:14.600
or 72 hours, we ask the projects that depend on Log4j to help us validate it, taking

20:14.600 --> 20:22.840
it and running their tests. But even if it's short, it should be possible to automate the

20:22.840 --> 20:31.720
process: seeing that there is a vote on Log4j, testing it, so you can say that. So you can

20:31.720 --> 20:38.760
already say "my application works with the new version", and Log4j can say: okay, there were

20:38.760 --> 20:50.600
no bugs introduced. And on day 7, we have the announcement. Okay, so we reduced it a lot.

20:51.240 --> 20:59.240
All right, that sums it up. Any questions?

21:02.120 --> 21:06.520
Thanks for the presentation. We have three minutes for questions.

21:20.600 --> 21:37.720
Any questions? We have unit tests, but the OpenSSF... The question was about the dynamic tests that

21:37.720 --> 21:46.200
we added. So we had unit tests, but we didn't have fuzzing tests; we introduced fuzzing

21:46.280 --> 21:54.840
during the STF contract. So that's the best practice that OpenSSF requires if you want

21:54.840 --> 22:01.880
the Best Practices mark.

22:01.880 --> 22:06.280
Do you have any solution for customers who don't move off the older versions of the

22:06.360 --> 22:14.120
library? That was one of the main problems of Log4j. Yeah, I thought a lot about that,

22:14.120 --> 22:23.720
but yeah, maybe I didn't say it, but if you looked at the graph, many customers in the first

22:23.720 --> 22:29.800
month, there were three releases, and there were many people that downloaded all three. So there

22:29.960 --> 22:37.000
is a part of the industry that's conscious, security conscious. The rest, I think the solution is

22:37.000 --> 22:47.800
CRA and fines, but I don't know. I thought a lot about that, or maybe we could support something

22:47.800 --> 22:55.160
for 10 years, but yes, we support this version for 10 years, but as soon as we have a patch release,

22:55.240 --> 23:05.560
you need to upgrade. Thanks for the very interesting talk. I have a question,

23:05.560 --> 23:10.840
because, I mean, the Log4j case was very popular. Yeah, many lessons learned that you,

23:10.840 --> 23:17.960
I think, shared with us. I wonder whether these lessons were propagated

23:18.040 --> 23:26.760
to other projects in the whole foundation? Well, actually, Log4j was one of the first

23:26.760 --> 23:37.960
that automated the build. I don't know if it's a direct cause and effect, but now we

23:37.960 --> 23:44.760
have this project, Apache Trusted Releases, so we have one person that works full time on this.

23:47.960 --> 23:52.440
Well, we try to spread the word at conferences.

