WEBVTT

00:00.000 --> 00:10.000
How many of you are using Nightly here?

00:10.000 --> 00:14.000
So who's annoyed by the message that you get once in a while?

00:14.000 --> 00:19.000
The one saying you need to restart to continue using it.

00:19.000 --> 00:20.000
Nobody's annoyed.

00:20.000 --> 00:23.000
Oh, shit, you're working for nothing.

00:23.000 --> 00:26.000
And so Alex is going to talk about that.

00:26.000 --> 00:29.000
So please, remember to sit in the middle.

00:29.000 --> 00:31.000
So people that come late can sit.

00:31.000 --> 00:32.000
Thank you very much.

00:32.000 --> 00:34.000
Oh, one last thing for the newcomers.

00:34.000 --> 00:37.000
We exit by the doors that are downstairs.

00:37.000 --> 00:38.000
There are stickers on the tables.

00:38.000 --> 00:40.000
Here, here and there.

00:40.000 --> 00:43.000
Please feel free to take them.

00:43.000 --> 00:44.000
Hello.

00:44.000 --> 00:45.000
Welcome to this talk.

00:45.000 --> 00:52.000
So I'm Alex and I have been working for Mozilla for more than 11 years now.

00:52.000 --> 00:57.000
And I'm here to present something we call ForkServer.

00:57.000 --> 01:02.000
My name is not on the slide because this has been work

01:02.000 --> 01:05.000
not just done by me; it also involved

01:05.000 --> 01:10.000
Thinker Li, Jed Davis, Nika Layzell and other people.

01:10.000 --> 01:15.000
So this is what he was talking about.

01:15.000 --> 01:17.000
How many people have been annoyed by that?

01:17.000 --> 01:20.000
On a daily basis?

01:21.000 --> 01:25.000
It's mostly people on Linux or on release.

01:25.000 --> 01:28.000
And let's explain why you see this.

01:28.000 --> 01:33.000
So the reason is that when we create a new process in Firefox,

01:33.000 --> 01:36.000
we have to read from disk.

01:36.000 --> 01:41.000
And whatever is on disk may have changed because your package manager applied an update.

01:41.000 --> 01:48.000
And so when we start the process, let's say that you need to view a website

01:48.000 --> 01:50.000
and you just opened a tab,

01:50.000 --> 01:53.000
then we have the child that is going to communicate with the parent process,

01:53.000 --> 01:55.000
through IPC.

01:55.000 --> 02:01.000
And the first thing that we are going to do is verify which version of the IPC protocol we are speaking.

02:01.000 --> 02:04.000
Because if we are not using the same version,

02:04.000 --> 02:09.000
we are going to run into very weird issues that we had in the past,

02:09.000 --> 02:12.000
that were triggering super complicated bugs,

02:12.000 --> 02:14.000
as well as wasting our time.

02:14.000 --> 02:19.000
So we decided to say that if the version on disk does not match the version in memory,

02:19.000 --> 02:22.000
between the parent and the child, then we stop everything,

02:22.000 --> 02:23.000
and we annoy people.
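
NOTE
Editor's sketch, not from the talk: one way the version check described above could look.
The names Hello, BuildIdMismatch and accept_child are hypothetical and are not Firefox's real IPC API.
The only idea taken from the talk is that a child whose build does not match the parent's is refused.
```python
# Hypothetical sketch: names and message layout are illustrative only.
from dataclasses import dataclass
@dataclass(frozen=True)
class Hello:
    """First message a freshly started child sends to the parent."""
    build_id: str
class BuildIdMismatch(Exception):
    """Raised when the child was loaded from a different build than the parent."""
def accept_child(parent_build_id: str, hello: Hello) -> None:
    # The parent's ID reflects the code already in memory; the child's reflects what is on disk now.
    if hello.build_id != parent_build_id:
        raise BuildIdMismatch(f"parent={parent_build_id} child={hello.build_id}: restart required")
if __name__ == "__main__":
    accept_child("20250201", Hello("20250201"))  # same build: the handshake passes
    try:
        accept_child("20250201", Hello("20250202"))  # files on disk were replaced by an update
    except BuildIdMismatch as err:
        print("refusing child:", err)
```
Failing early like this trades a visible restart prompt for not having to debug two builds speaking different protocol versions.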

02:23.000 --> 02:27.000
Because it's much more fun for you to be annoyed

02:27.000 --> 02:30.000
than for us to be annoyed.

02:30.000 --> 02:35.000
That's the main motivation, the user-facing value of ForkServer.

02:35.000 --> 02:41.000
So first I will try to explain how we got into this situation with so many processes,

02:41.000 --> 02:44.000
and why you run into that issue so often.

02:44.000 --> 02:47.000
Then I will explain in a little bit of detail,

02:47.000 --> 02:49.000
but not too much, what ForkServer is.

02:49.000 --> 02:54.000
And I will look at the improvements beyond this dialog,

02:54.000 --> 02:57.000
which should not show up anymore.

02:57.000 --> 02:59.000
Let's do a little bit of history.

02:59.000 --> 03:03.000
A long time ago, the web was very simple.

03:03.000 --> 03:09.000
You had no audio, you had no video, you had computers with only one processor,

03:09.000 --> 03:13.000
and security was not so much of a big issue.

03:13.000 --> 03:17.000
The attack surface was much simpler.

03:17.000 --> 03:23.000
So the original design of a browser was a single process, doing everything.

03:23.000 --> 03:31.000
Doing UI, rendering of the web pages, doing the disk access, everything, everything.

03:31.000 --> 03:36.000
And there was no good reason to split that into many processes.

03:37.000 --> 03:42.000
Because you had no CPU with many processors to take advantage of it.

03:42.000 --> 03:48.000
Then a few years later, we introduced audio, we introduced video.

03:48.000 --> 03:55.000
And we did that, for example, using Flash, using NPAPI, which some of you might remember.

03:55.000 --> 04:01.000
And those were very complex pieces of software running within the process.

04:01.000 --> 04:09.000
So adding a lot of attack surface to the process, adding a lot of stability issues.

04:09.000 --> 04:13.000
So there was a good incentive to have a separate process for them.

04:13.000 --> 04:18.000
Because, for example, if Flash was behaving badly, we could kill it and restart it.

04:18.000 --> 04:21.000
without having the whole browser crash.

04:21.000 --> 04:27.000
So it was convenient for users and also convenient for us.

04:27.000 --> 04:31.000
So there was one less bug that we would be blamed for.

04:31.000 --> 04:35.000
And the story continues a little bit further.

04:35.000 --> 04:39.000
We had GMP, which introduced some video codecs.

04:39.000 --> 04:42.000
And at some point, the web became a little bit more complicated.

04:42.000 --> 04:45.000
We started to have to isolate stuff.

04:45.000 --> 04:50.000
And with more complexity, we introduced even more processes.

04:50.000 --> 04:53.000
Like GPU, content, web extension.

04:53.000 --> 05:00.000
And as of today, everything in bold is a process type in Firefox.

05:00.000 --> 05:07.000
So it means that you have around 10 process types that you can have on your computer.

05:07.000 --> 05:12.000
If you go into about:processes, you will see a lot of them.

05:12.000 --> 05:19.000
And for example, content processes can be isolated for each site in the browser.

05:19.000 --> 05:22.000
So each time you load a new website, you have a new process.

05:22.000 --> 05:28.000
It's more or less that. We have processes to manage access to the GPU,

05:28.000 --> 05:37.000
to take care of the video codecs, to perform audio and video decoding,

05:37.000 --> 05:39.000
to perform network access.

05:39.000 --> 05:48.000
And one of the main reasons we split things into different processes this way

05:48.000 --> 05:54.000
is security, mostly, because we can perform sandboxing at the process level.

05:54.000 --> 05:57.000
On Linux we perform sandboxing using seccomp.

05:57.000 --> 06:00.000
So it means that we are filtering system calls.

06:00.000 --> 06:06.000
And one very simple example is that a content process does not have access to the file system.

06:06.000 --> 06:15.000
It has to make a broker request to the parent process, which can verify the file system access.
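
NOTE
Editor's sketch, not from the talk: the broker idea reduced to a path allow-list checked in the parent.
The policy, broker_allows and handle_open_request are assumptions made for illustration, not Firefox's broker.
```python
# Hypothetical sketch: a sandboxed child cannot open files itself and asks the parent instead.
import os
def broker_allows(path: str, allowed_dirs: list[str]) -> bool:
    """Parent-side check: does the requested path resolve inside an allowed directory?"""
    real = os.path.realpath(path)  # resolve symlinks and ".." before comparing
    for allowed in allowed_dirs:
        root = os.path.realpath(allowed)
        if os.path.commonpath([real, root]) == root:
            return True
    return False
def handle_open_request(path: str, allowed_dirs: list[str]) -> int:
    """Runs in the parent: open the file on behalf of the child, or refuse."""
    if not broker_allows(path, allowed_dirs):
        raise PermissionError(f"broker denied {path}")
    return os.open(path, os.O_RDONLY)
```
In a real sandbox the resulting file descriptor would then be passed back to the child over IPC; that step is left out here.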

06:15.000 --> 06:18.000
So it means that we can control access, even for things that are not directly security.

06:18.000 --> 06:22.000
We can forbid access to some stuff for security purposes.

06:22.000 --> 06:28.000
We also use that for fingerprinting protection.

06:28.000 --> 06:36.000
And recently we also introduced a new kind of process, the utility process, which has the unique property

06:36.000 --> 06:40.000
of allowing us to sandbox things even more simply.

06:40.000 --> 06:44.000
Because in the past, adding a new type of process was a little bit complicated.

06:44.000 --> 06:51.000
And we wanted to be able to sandbox a little bit more, especially, for example,

06:51.000 --> 06:56.000
audio decoding, which in the past lived within the RDD process.

06:56.000 --> 07:03.000
So it means that we had to share the same sandbox between video decoders and audio decoders.

07:03.000 --> 07:12.000
And video decoders may require hardware access to a specific area of, for example, your GPU to perform video decoding.

07:12.000 --> 07:18.000
And doing that often means that you have to open holes in the sandbox.

07:18.000 --> 07:21.000
And a sandbox, you want it to be as tight as possible.

07:21.000 --> 07:26.000
And the only way to sandbox differently is to have different kinds of processes.

07:26.000 --> 07:31.000
Let's just explain a little bit how we create a process.

07:31.000 --> 07:37.000
So for those of you who are a little bit used to that on Linux, you do clone or fork.

07:37.000 --> 07:45.000
And then you do exec, and exec is a good thing because it reloads everything from scratch, which is also a bad thing.

07:45.000 --> 07:48.000
Because it reloads everything from scratch.

07:48.000 --> 07:52.000
So everything we did in one process we have to redo every time.

07:52.000 --> 07:58.000
And we lose all the nice things that we have with fork, like copy-on-write,

07:58.000 --> 08:06.000
which means that when you fork a process you should be able to share memory that has not been modified by the child.

08:06.000 --> 08:09.000
And we cannot do that anymore.

08:09.000 --> 08:14.000
So each time we create a new process it has a cost in CPU time and in memory.
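
NOTE
Editor's sketch, not from the talk: fork without exec, so the child reuses what the parent already built.
expensive_init is a made-up stand-in for startup work; os.fork, os.pipe and os.waitpid are standard POSIX wrappers.
```python
# Sketch only: shows the control flow of fork-without-exec on a POSIX system.
import os
def expensive_init() -> list[int]:
    """Stand-in for startup work that exec would force every new process to redo."""
    return [i * i for i in range(100_000)]
def sum_in_forked_child(table: list[int]) -> int:
    """Fork without exec: the child reads the parent's table instead of rebuilding it."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: the table is already in memory; the kernel shares pages until one side writes to them.
        os.close(read_fd)
        os.write(write_fd, str(sum(table)).encode())
        os._exit(0)
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as reader:
        data = reader.read()
    os.waitpid(pid, 0)
    return int(data)
if __name__ == "__main__":
    table = expensive_init()  # done once, in the parent
    print(sum_in_forked_child(table) == sum(table))
```
CPython updates reference counts even on reads, so it shares fewer pages than a C or C++ program would; the example is about the control flow, not about measuring memory.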

08:14.000 --> 08:18.000
And as you saw, we now have a lot of processes in Firefox.

08:18.000 --> 08:21.000
And this is why we have ForkServer.

08:22.000 --> 08:27.000
So how do we create a new process with ForkServer? We ask another process.

08:27.000 --> 08:35.000
On Chrome they already have Zygote. On Firefox OS a long time ago we had a process called Nuwa.

08:35.000 --> 08:38.000
And the strategy is very simple. We start a process.

08:38.000 --> 08:42.000
We get it as far as we can in the startup process.

08:42.000 --> 08:46.000
And we will fork from that point later.

08:46.000 --> 08:50.000
So this way it's faster. We don't have to redo some of the initialization.

08:50.000 --> 08:53.000
We can share more memory between processes.

08:53.000 --> 08:56.000
The risk is that now we don't have exec.

08:56.000 --> 09:02.000
So we may get into a state where we are sharing state that we don't want to share.

09:02.000 --> 09:06.000
Like, for example, some random number generators, which is not a good thing.
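
NOTE
Editor's sketch, not from the talk: a toy fork server that is started once and forks a worker per request.
The one-byte request protocol and the names run_fork_server and spawn_workers are invented for illustration.
Each worker reseeds its random generator, the kind of inherited state the talk warns about sharing.
```python
# Sketch only: a minimal fork server over a socket pair on a POSIX system.
import os
import random
import socket
def run_fork_server(conn: socket.socket) -> None:
    """Runs in the server process: fork one worker per request byte, exit on EOF."""
    while conn.recv(1):
        reply_r, reply_w = os.pipe()
        pid = os.fork()
        if pid == 0:
            random.seed()  # reseed so sibling workers do not share generator state
            os.write(reply_w, f"{os.getpid()}:{random.random()}\n".encode())
            os._exit(0)
        os.close(reply_w)
        with os.fdopen(reply_r, "rb") as reader:
            conn.sendall(reader.read())
        os.waitpid(pid, 0)
    os._exit(0)
def spawn_workers(count: int) -> list[str]:
    """Runs in the main process: start the server, then ask it for `count` workers."""
    parent_end, server_end = socket.socketpair()
    server_pid = os.fork()
    if server_pid == 0:
        parent_end.close()
        run_fork_server(server_end)  # never returns
    server_end.close()
    reader = parent_end.makefile("rb")
    replies = []
    for _ in range(count):
        parent_end.sendall(b"F")  # "please fork a worker"
        replies.append(reader.readline().decode().strip())
    reader.close()
    parent_end.close()  # EOF tells the server to exit
    os.waitpid(server_pid, 0)
    return replies
if __name__ == "__main__":
    for line in spawn_workers(2):
        print(line)
```
CPython already reseeds its random module in a forked child; the explicit call only makes that step visible, since a program that forks without exec has to do it by hand.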

09:06.000 --> 09:12.000
Nuwa was one of the solutions we had on Firefox OS.

09:12.000 --> 09:14.000
It was very complicated.

09:14.000 --> 09:17.000
It completely started content processes,

09:17.000 --> 09:20.000
so the ones that we use to run web pages.

09:20.000 --> 09:22.000
And it froze them.

09:22.000 --> 09:31.000
And it did a lot of magic to be able to unfreeze them and put them back into a state where they can be used as a new process.

09:31.000 --> 09:33.000
And it was very efficient.

09:33.000 --> 09:35.000
But it was also super fragile.

09:35.000 --> 09:39.000
And I have been myself the victim of a nasty regression.

09:39.000 --> 09:42.000
I had just added a test for Firefox OS.

09:42.000 --> 09:46.000
And we had a very bad execution time regression.

09:46.000 --> 09:53.000
And people blamed me when it was in fact a regression in Nuwa.

09:53.000 --> 09:54.000
With ForkServer,

09:54.000 --> 09:58.000
we have decided to go with a simpler design.

09:58.000 --> 10:02.000
We do not start a whole content process.

10:02.000 --> 10:04.000
We just start a very minimal process.

10:04.000 --> 10:08.000
But it is not so minimal that it fails to share enough data.

10:09.000 --> 10:16.000
We communicate over a socket to perform the call and say,

10:16.000 --> 10:18.000
hey, I want a new process.

10:18.000 --> 10:20.000
And we do not start JS.

10:20.000 --> 10:23.000
And that is already quite a few changes.

10:23.000 --> 10:27.000
We are to deal with chain model states.

10:27.000 --> 10:32.000
If we do not do that, we completely regress web content performance.

10:32.000 --> 10:36.000
And we had a bit of fun with PGO and PIDs.

10:36.000 --> 10:41.000
Basically, when we landed it, in the first half of July, we got backed out.

10:41.000 --> 10:45.000
Because we had completely destroyed performance.

10:45.000 --> 10:47.000
Because we had just broken PGO.

10:47.000 --> 10:49.000
But that is fine.

10:49.000 --> 10:50.000
And why is it fine?

10:50.000 --> 10:53.000
Because we have been able to reland.

10:53.000 --> 10:57.000
And now it is enabled by default on Linux,

10:57.000 --> 11:00.000
on Nightly, since the end of October.

11:00.000 --> 11:05.000
There is still some work to do to be able to enable that on all of the channels.

11:05.000 --> 11:11.000
But as far as I know, nobody has complained about a regression,

11:11.000 --> 11:13.000
which is nice.

11:13.000 --> 11:17.000
So if you are using Nightly on Linux, please have a look.

11:17.000 --> 11:20.000
And maybe we have missed some issues.

11:20.000 --> 11:23.000
And please report them so that we can fix them.

11:23.000 --> 11:26.000
Because these are the wins that we have.

11:27.000 --> 11:34.000
So on those two charts, the first one is the memory usage of a content process.

11:34.000 --> 11:41.000
You don't see the figures, but the stack of points on the other side is around 15 megabytes.

11:41.000 --> 11:44.000
That's before we enabled ForkServer.

11:44.000 --> 11:49.000
And when we enabled it, we went to 7.5 megabytes.

11:50.000 --> 11:54.000
So you can also see the times we enabled it and got backed out.

11:54.000 --> 11:56.000
So 50% improvement.

11:56.000 --> 12:03.000
And here we have a metric that we use to see the startup time of a process.

12:03.000 --> 12:08.000
So we went from around 60 milliseconds to around 52 milliseconds.

12:08.000 --> 12:10.000
So 75% improvement.

12:10.000 --> 12:12.000
That's not bad.

12:12.000 --> 12:14.000
And I guess that's all.

12:14.000 --> 12:15.000
If you have questions.

12:15.000 --> 12:16.000
Yes.

12:16.000 --> 12:26.000
So any questions?

12:26.000 --> 12:28.000
And I guess we don't have questions.

12:28.000 --> 12:30.000
Thank you Alexander.

