WEBVTT

00:00.000 --> 00:15.200
Hello everyone, thanks for coming.

00:15.200 --> 00:20.460
In this talk, I will explain how we improved the performance of some delta-related

00:20.460 --> 00:29.820
operations in Mercurial by a lot, with simple things.

00:29.820 --> 00:37.820
So Mercurial, if you don't know what it is, which is most people, it's a distributed source

00:37.820 --> 00:41.100
control tool, like Git, same generation.

00:41.100 --> 00:44.660
Its code base is in Python, C and Rust.

00:44.660 --> 00:51.180
The code base is 20 years old, because the tool is 20 years old, and I have been a contributor

00:51.180 --> 01:00.140
to it since 2010, and I'm now one of its maintainers.

01:00.140 --> 01:02.420
So let's talk about data compression.

01:02.420 --> 01:03.700
We're doing version control.

01:03.700 --> 01:10.900
We have multiple versions of the same file, and the most naive way to store multiple versions

01:10.900 --> 01:16.900
of the same file is to store each version as full text.

01:16.900 --> 01:20.980
However, this is not very efficient in terms of space.

01:20.980 --> 01:27.340
So one way to solve that is to use delta compression, where you store a full version of your

01:27.340 --> 01:31.940
text initially, and then you compute a small difference, and you store a delta which

01:31.940 --> 01:37.340
is just a small edit instruction to get the second version of the file from the first one.

01:37.340 --> 01:42.260
Then you can do another delta that applies on top of the second one, and that chain of deltas

01:42.260 --> 01:45.060
is going to give you a pretty good compression.
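
The full-text-plus-deltas scheme can be modeled in a few lines. This is a minimal Python sketch, not Mercurial's actual on-disk format; `apply_delta` and `restore` are hypothetical names:

```python
# A delta is a list of (start, end, replacement) instructions against a base.
def apply_delta(base: bytes, delta) -> bytes:
    out, pos = [], 0
    for start, end, data in delta:
        out.append(base[pos:start])  # keep the unchanged bytes before the edit
        out.append(data)             # splice in the replacement
        pos = end                    # skip the replaced region of the base
    out.append(base[pos:])           # keep the unchanged tail
    return b"".join(out)

def restore(full_text: bytes, chain) -> bytes:
    # Restoring a revision means applying every delta in its chain in order.
    for delta in chain:
        full_text = apply_delta(full_text, delta)
    return full_text
```

For example, `apply_delta(b"hello world", [(0, 5, b"goodbye")])` yields `b"goodbye world"`; the longer the chain handed to `restore`, the more work restoration takes.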

01:45.060 --> 01:50.140
However, if your chain of deltas is endless, you're going to have a bad time when you try

01:50.140 --> 01:53.980
to restore the content, because you have to do a lot of patch application to get there.

01:53.980 --> 02:00.100
So there are ways to solve that, like doing full snapshots of the content from time to time.

02:00.100 --> 02:04.740
So you can still get a good compression, but you can also control the amount of time

02:04.740 --> 02:09.340
that you need to restore your content.
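
That idea, capping the delta chain with periodic snapshots, can be sketched with a hypothetical storage model; the entries, the `max_chain` threshold and the function names are made up for illustration, not Mercurial's real heuristics:

```python
# Hypothetical model: each entry is ("snapshot", None) or ("delta", base_index).

def chain_length(revlog, index):
    """Number of deltas you must apply to restore revision `index`."""
    n = 0
    while revlog[index][0] == "delta":
        n += 1
        index = revlog[index][1]  # follow the chain down to its base
    return n

def add_revision(revlog, base, max_chain=3):
    """Store a delta against `base`, or a full snapshot if the chain is long."""
    if base is None or chain_length(revlog, base) + 1 > max_chain:
        revlog.append(("snapshot", None))  # reset the chain
    else:
        revlog.append(("delta", base))
```

Appending revisions linearly under this policy, a snapshot appears every few entries, so restoration cost stays bounded while most entries remain cheap deltas.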

02:09.340 --> 02:14.980
And of course, it's not as simple even for one chain, because it's distributed version

02:14.980 --> 02:15.980
control.

02:15.980 --> 02:20.580
So you have branching, merging, and the versions of your file are not just linear.

02:20.580 --> 02:22.820
You have like, branches.

02:22.820 --> 02:27.460
So you end up building a delta tree, where you have deltas that apply

02:27.460 --> 02:32.020
on top of other deltas, not necessarily linearly, so you have other questions regarding storage

02:32.020 --> 02:38.820
density and things like that, but you have better compression and a good restoration time.

02:38.820 --> 02:50.700
And it can get even more complicated, as you try to solve more and more problems.

02:50.700 --> 02:54.620
Eventually, like right now in Mercurial, you have something where you have these full snapshots,

02:54.620 --> 02:59.580
you have delta trees, and then we collapse some of the deltas, but not to full text.

02:59.580 --> 03:06.540
And so we also have a tree of intermediate snapshots, and more delta trees on them, and it gets

03:06.540 --> 03:10.460
complicated, and I'm not going to dive too much into the details of that.

03:10.460 --> 03:15.020
It's worthy of, like, a computer science academic paper, which we are actually working on.

03:15.020 --> 03:18.900
So I'm not going to dive too much into that.

03:18.900 --> 03:22.140
So Mercurial uses delta encoding, and it does that on the fly.

03:22.140 --> 03:26.260
So when you do a commit or when you receive revisions, you're going to do the delta encoding

03:26.260 --> 03:27.900
at that time.

03:27.900 --> 03:31.100
Which means that you don't need maintenance operations like repacking.

03:31.100 --> 03:36.980
It's an important part of Mercurial's design, and of how we get it to scale.

03:36.980 --> 03:39.900
You can actually get very good compression for some content.

03:39.900 --> 03:46.860
The nice case we like to show is like, we have a file that has so many versions that

03:46.860 --> 03:51.860
all the full texts would be one petabyte, and we can store them in a bit less than

03:51.860 --> 03:58.180
20 gigabytes, which is like a 50,000x compression ratio, which we are happy about.

03:58.180 --> 04:04.140
And the nice thing with the deltas is that the deltas you computed on one repo, you can

04:04.140 --> 04:09.620
basically exchange them as is with another repo, which is going to be able to apply them

04:09.620 --> 04:16.820
as is. And it's mostly the case, but it's not always the case, because sometimes you

04:16.860 --> 04:23.660
have some deltas that are going to be weird because of the way we chained them, sometimes

04:23.660 --> 04:28.700
you decide to snapshot instead, and you also have servers that run in a paranoid mode,

04:28.700 --> 04:32.540
where they recompute everything to make sure that they have good deltas, because

04:32.540 --> 04:35.060
they don't really trust the client.

04:35.060 --> 04:40.820
And that can grow up to be a significant performance problem, because computing a delta

04:40.820 --> 04:45.860
is kind of expensive, and so when you exchange data, you can end up in a situation where

04:45.940 --> 04:51.340
most of the time, you spend doing delta computation instead of actually, like, sending

04:51.340 --> 04:58.620
something out. And it takes a lot of time, because computing deltas is slow,

04:58.620 --> 05:05.340
and it's slow because diffing is an O(n²) kind of problem, so of course it's

05:05.340 --> 05:12.860
conceptually slow, and it's a fundamental problem. So the solutions to work around

05:12.900 --> 05:18.660
this fundamental problem of diffing being slow are that you have fancy ideas: you can create

05:18.660 --> 05:22.660
better delta chains, that are going to give better delta trees, that are going to be easier to

05:22.660 --> 05:28.300
exchange; you can have a richer exchange protocol so that the server and the client better collaborate

05:28.300 --> 05:34.060
to create a good tree; and you can do a lot of restructuring, and all that deep computer science

05:34.060 --> 05:38.660
is going to mean that you're going to do less diffing, and so you're going to have fewer

05:38.740 --> 05:44.020
performance problems from diffing. So I did a bunch of that, and when you rework the

05:44.020 --> 05:47.940
tree, it means you have to re-compute the deltas for a lot of things, and for huge

05:47.940 --> 05:52.740
repos, it can take like a couple of days, and it's kind of slow and annoying when you're

05:52.740 --> 06:00.140
iterating with a two-day feedback loop. And so I'm sitting there, and I start asking myself a

06:00.140 --> 06:07.140
question that I should have asked myself a long time ago, like: why is diffing slow, actually,

06:07.140 --> 06:15.940
in practice, when I run it? So I can take a profiler and look at what the

06:15.940 --> 06:19.940
computer is actually doing, and I realize that it's mostly doing decompression, because

06:19.940 --> 06:24.780
the deltas and the full texts on disk are compressed, and we have a cache for not

06:24.780 --> 06:29.100
reading the data over and over, but we don't have a cache for not decompressing the

06:29.100 --> 06:35.020
data over and over, so in practice we're doing the very same operation in a loop over and

06:35.020 --> 06:41.180
over, and that's kind of silly. So there is a simple solution for that: we added a small

06:41.180 --> 06:47.260
LRU cache, and it's already significantly faster. After that, we realized that we spend

06:47.260 --> 06:53.740
most of the time computing SHA-1, because in Mercurial, when you restore a content,

06:53.740 --> 07:00.060
like you want to get the full version of a file, typically to write it on disk, but also

07:00.060 --> 07:07.180
when you want to get a version of a file to diff it against another file, by default you

07:07.180 --> 07:12.700
compute its SHA-1. And it's not really useful here, because when you receive deltas

07:12.700 --> 07:18.700
that you reuse as is, you're not checking the SHA-1 of the full text at that time; you're

07:18.700 --> 07:23.500
going to check the SHA-1 of the full text when you actually restore it. And so why would you

07:23.500 --> 07:30.140
check the SHA-1 of content, with an expensive operation, at that point? It doesn't

07:30.140 --> 07:38.300
give you anything compared to not doing it at all. So we can stop doing it when we do this

07:38.300 --> 07:45.340
delta computation, and it speeds things up quite significantly. And the profiler again showed decompression,

07:45.340 --> 07:51.580
because when we wrote that, we forgot to cache the data we write on disk. So, when we

07:51.660 --> 07:56.140
receive a delta, and it's good: we compress it, we write it on disk, and then we have to,

07:56.140 --> 07:59.740
like, read it from disk, decompress it, and put it in the cache,

07:59.740 --> 08:09.660
which is kind of silly. So, we fixed that too. And at that point, I finally see the diffing part

08:10.060 --> 08:20.620
in my profile, having already sped things up by a few x solving very simple things, and we can now ask

08:20.620 --> 08:26.060
ourselves the question: do you think the diffing is slow because the algorithm is O(n²),

08:27.340 --> 08:33.260
or do you think it's because the diffing code has been written in 2005, and it's just doing

08:33.260 --> 08:39.660
everything manually, or do you think there is more low-hanging fruit? Like, if you think it's the algorithm,

08:39.660 --> 08:47.180
raise your hand. Okay. If you think it's the 2005 code, raise your hand. A few people. If you think

08:47.260 --> 08:53.660
low-hanging fruit? More people. Okay, congratulations, you're right, surprisingly.

08:56.060 --> 09:01.500
When we do the diffing, we have to get the two texts we want to diff, then there is an optimization

09:01.500 --> 09:06.540
that finds the common parts, without doing diffing, like at the beginning and in certain spots,

09:06.540 --> 09:10.620
and then we do the real diffing on the parts we know are different.

09:10.700 --> 09:19.340
Typically, we do memory comparison to detect the common prefix, and after that we have to

09:20.060 --> 09:25.420
hash lines so that we can run the actual diffing algorithm. And so if you do a simple memory

09:25.420 --> 09:31.580
comparison for the prefix, you skip a lot of line hashing and you reduce the size of the input to the

09:31.580 --> 09:35.980
diffing algorithm, and it's good. So, for a long time, we had been doing

09:35.980 --> 09:42.780
prefix comparison to speed things up, but we had never done suffix comparison before. So you can

09:42.780 --> 09:49.100
just take the code for the prefix check, and use it also for the suffix check, and again,

09:49.100 --> 09:56.220
you have a pretty significant speed up, because you stop doing the same thing over. So at that

09:56.220 --> 10:02.540
point, depending on the workload, we are already about five times faster for, like,

10:02.620 --> 10:08.700
ingesting data where we compute the diffs, and we really didn't do anything fancy: like,

10:08.700 --> 10:15.100
all the parts are super simple, because either you stop doing work that you didn't need to do,

10:15.100 --> 10:20.460
or you stop redoing work that you already did elsewhere, and you have a performance gain.

10:21.180 --> 10:27.660
We didn't change the language used to implement anything, we didn't actually use any of the

10:27.660 --> 10:34.540
fancy ideas, because the 5x is on the old version, like, without the fancy ideas, and we didn't

10:34.540 --> 10:41.740
have to refactor anything around or change anything in terms of actual architecture. And yet,

10:41.740 --> 10:50.140
we saved, like, 80% of the time already. So it's kind of a point against doing fancy math,

10:50.140 --> 10:54.460
and for just looking at what is going on in practice, and not doing stupid things.
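
The prefix and suffix trimming mentioned a moment ago can be sketched like this (a generic illustration; `trim_common` is a hypothetical name, not Mercurial's code):

```python
def trim_common(a: bytes, b: bytes):
    """Strip the common prefix and suffix so only the middle needs diffing."""
    limit = min(len(a), len(b))
    # Common prefix: a cheap linear scan over raw bytes, no line hashing.
    p = 0
    while p < limit and a[p] == b[p]:
        p += 1
    # Common suffix: same scan from the end, careful not to overlap the prefix.
    s = 0
    while s < limit - p and a[len(a) - 1 - s] == b[len(b) - 1 - s]:
        s += 1
    # The expensive diff algorithm only ever sees the two middles.
    return p, s, a[p:len(a) - s], b[p:len(b) - s]
```

For `b"hello brave world"` versus `b"hello new world"`, only `b"brave"` and `b"new"` are left for the real diff to chew on.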

10:54.780 --> 11:03.580
So, now, the diff code is taking most of the time, and that diff code is inherited from kernel

11:03.580 --> 11:09.580
developers, so it was probably good in terms of efficiency when it was written, but it was written

11:09.580 --> 11:15.260
20 years ago. It does everything manually, because it doesn't have any dependencies:

11:15.260 --> 11:21.580
it doesn't use any optimized code from anyone; it doesn't have memory safety, because it's in

11:21.580 --> 11:26.460
C, which is not really relevant for performance right now, but is definitely useful for other stuff;

11:27.740 --> 11:36.620
and it's looping over things byte by byte and things like that. So, I just moved that piece of code

11:36.620 --> 11:43.180
to Rust. I used an existing diffing library, because I didn't want to think too hard about

11:43.180 --> 11:52.380
diffing for now. For the comparison of the prefix and the suffix, I can write normal code, and it's

11:52.380 --> 11:58.780
going to be turned into SIMD code when you compile it, from Rust, automatically: like, I don't

11:58.780 --> 12:04.540
actually call SIMD myself, I just say, compare this thing block by block, and Rust is going

12:04.620 --> 12:12.860
to do the right thing. I get memory safety in the process, and in practice, just doing that is giving

12:12.860 --> 12:21.580
me a performance boost on the diffing part between 1.5x and 5x, mostly just because of

12:21.580 --> 12:27.980
the SIMD and the more efficient code generation from Rust. I'm not even sure the diffing library is that

12:27.980 --> 12:35.100
smart and that performant, but at that point it's not really the main part of where we're spending

12:35.100 --> 12:48.140
time in the diffing. At that point, we move from this to this, where getting the text, and then

12:48.140 --> 12:56.460
actually walking the memory, is most of the time. So, let's think a bit about what happens,

12:56.620 --> 13:08.780
like, let's open the box on getting the text. So, I talked before about the fact that we have

13:08.780 --> 13:16.540
the full text and then a list of deltas. If you were to just take the full text, apply the first

13:16.540 --> 13:22.060
delta to get another full text, and then apply the next delta to get another full text, it would be

13:22.060 --> 13:29.340
quite inefficient: you would copy memory over and over and things like that. So, we don't do that

13:29.340 --> 13:37.820
in reality. We do something called delta folding, and we do it two by two, because it's

13:37.820 --> 13:45.420
a simpler problem. So, we take two patches, and just by looking at the metadata, we get a bigger patch;

13:45.420 --> 13:51.500
and two by two, in the end, we just have one large patch that we have to apply to the full text, and that's more efficient.
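
Delta folding can be illustrated on a deliberately restricted case: if every instruction replaces a byte range with the same number of bytes, offsets never shift, and folding two patches is just merging their edits, with the later patch winning on overlap. Real folding must also remap offsets between coordinate spaces; this sketch (hypothetical `fold_inplace`) skips that:

```python
def fold_inplace(d1, d2):
    """Fold two length-preserving deltas into one equivalent delta.

    Each instruction is (start, end, data) with end - start == len(data),
    so both deltas share one coordinate space and d2 simply overwrites d1.
    """
    edits = {}
    for start, end, data in d1 + d2:          # d2 comes last, so it wins
        assert end - start == len(data)
        for k, byte in enumerate(data):
            edits[start + k] = byte
    # Re-group consecutive edited bytes back into instructions.
    folded, run = [], []
    for pos in sorted(edits):
        if run and pos == run[-1][0] + 1:
            run.append((pos, edits[pos]))
        else:
            if run:
                folded.append(_to_instruction(run))
            run = [(pos, edits[pos])]
    if run:
        folded.append(_to_instruction(run))
    return folded

def _to_instruction(run):
    start = run[0][0]
    return (start, run[-1][0] + 1, bytes(b for _, b in run))
```

Folding `[(0, 2, b"XY")]` with `[(1, 3, b"AB")]` yields the single patch `[(0, 3, b"XAB")]`, which can then be applied to the full text in one pass.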

13:52.780 --> 13:59.260
And how efficient is that code, actually? Just looking a bit at what is happening

13:59.260 --> 14:06.700
in it, we realized that we can be a bit more efficient by pre-processing the deltas before doing the

14:06.700 --> 14:12.060
processing, because, like, if we gather the metadata together, we have better memory

14:12.140 --> 14:19.340
locality, and we skip random accesses. And then, when we have this tree of things that we

14:19.340 --> 14:28.860
merge together over and over, we can pre-size the vector: we know we have that many elements

14:28.860 --> 14:34.460
on one side and that many elements on the other side, and so we have a ballpark estimate of how

14:34.460 --> 14:39.020
big the result is going to be. So, instead of starting with a small vector, we can pre-size the

14:39.020 --> 14:47.500
vector to avoid the cost of resizing it over and over. And we can limit allocations by reusing the

14:47.500 --> 14:52.700
vectors of the deltas that we no longer need to build the

14:52.700 --> 14:59.340
new ones. Like, there is nothing fancy in it. Like, this is kind of basic performance stuff

14:59.340 --> 15:06.860
you would get from any performance course, but it was not done in the initial code. And part of

15:07.180 --> 15:14.700
that is, when someone rewrote parts of Mercurial in Rust, usually just rewriting it in Rust

15:14.700 --> 15:21.820
made it significantly faster, and so they stopped there. But at some point, we could go back and

15:21.820 --> 15:26.620
look at what they did, and actually realize that there is a bunch of low-hanging fruit for optimization

15:26.620 --> 15:31.500
left there. So, doing this, the delta folding is going to be significantly faster.
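
The pre-sizing point is easy to illustrate: when the metadata already tells you the sizes involved, allocate the output buffer once instead of letting it grow. A generic Python sketch (hypothetical `concat_known_sizes`), standing in for the vector pre-sizing done in the Rust code:

```python
def concat_known_sizes(chunks):
    # Ballpark-size the output up front from sizes we already know,
    # instead of growing (and reallocating) the buffer piece by piece.
    total = sum(len(c) for c in chunks)
    out = bytearray(total)
    pos = 0
    for c in chunks:
        out[pos:pos + len(c)] = c
        pos += len(c)
    return bytes(out)
```

The same shape works for merging delta instruction lists: the sizes of the two inputs give a ballpark for the output, so one allocation replaces many.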

15:32.460 --> 15:39.420
So, we are really now in the diffing. Like, we have a full text, we need to find the

15:39.420 --> 15:49.420
common parts, and we need to compute the diff from it. So, let's look back at the delta chain.

15:50.060 --> 15:56.460
In practice, when you have two contents, content A and content B, there is a good chance

15:56.460 --> 16:01.900
that they share a common part in their delta chain, because in practice, the delta chains

16:01.900 --> 16:10.300
are, like, multiple hundreds of deltas long. And so, we can look at: oh, there is a lot of things

16:10.300 --> 16:15.260
in common, and the small things that are exclusive to A, and the small things that are exclusive to B.

16:16.460 --> 16:22.060
And we know how to fold things, so, instead of folding everything on each side, we can do

16:22.140 --> 16:27.980
this kind of thing, where we have a delta for all the common parts, a delta for all the

16:27.980 --> 16:33.820
exclusive parts of A, and a delta for the exclusive parts of B, just using tools we already have and already use.

16:35.260 --> 16:43.340
This means that, after that, we can see what it would look like if the full text existed.

16:43.980 --> 16:49.660
We have a delta for A: we just need metadata about when it starts touching the base

16:50.460 --> 16:56.220
and when it stops touching the base. And the same for B. And so, we can

16:56.220 --> 17:03.740
know, in the base full text, modulo the common parts, where we're going to start

17:03.740 --> 17:12.060
touching it and where we're going to stop touching it, which means that we know, without having to look at

17:12.060 --> 17:19.660
any memory or whatever, that this part is the same, and this part is the same. And we could

17:19.660 --> 17:24.380
probably find parts that are the same in the middle, but there is some ambiguity in the

17:24.380 --> 17:28.620
diff, and so, let's not get into that now. We're just doing the simple thing for now.
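
Reading the untouched regions off the delta metadata can be sketched like this: if the deltas for A and B are both expressed against the same base, everything before the first touched byte and after the last touched byte of either delta is known common, without ever looking at the content (hypothetical helper names):

```python
def touched_range(delta):
    """(first, last) base offsets touched by a delta of (start, end, data)."""
    return min(s for s, _, _ in delta), max(e for _, e, _ in delta)

def known_common(base_len, delta_a, delta_b):
    # From metadata alone: base bytes before the first byte either side
    # touches, and after the last byte either side touches, are identical
    # in A and B. Return (common_prefix_len, common_suffix_len).
    first_a, last_a = touched_range(delta_a)
    first_b, last_b = touched_range(delta_b)
    return min(first_a, first_b), base_len - max(last_a, last_b)
```

With a 100-byte base, a delta for A touching bytes 10..20 and a delta for B touching bytes 30..40, the first 10 and the last 60 bytes are known equal before any diffing starts.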

17:30.620 --> 17:37.420
So, looking a bit at the data we have, we can find the common sections just by working on the

17:37.420 --> 17:44.700
metadata from the deltas themselves. So, we copy less memory, because we don't need to

17:44.700 --> 17:50.780
copy the memory of the common parts. And we walk less memory, because we don't need to walk that

17:50.780 --> 17:58.380
memory to realize that it's the same: we already know it. And another important part is that the

17:58.380 --> 18:03.820
section we are going to run the whole algorithm on, the whole diffing on, is significantly smaller.

18:04.140 --> 18:11.100
And being significantly smaller means that it is significantly more cache friendly, and more efficient

18:11.100 --> 18:24.460
overall. There is something that I didn't say so far: most of the time we spend

18:24.460 --> 18:29.580
computing deltas, in the operation I was looking at, is about the Mercurial manifest.

18:30.300 --> 18:37.660
The Mercurial manifest is equivalent to Git trees: it stores the state of a revision, for

18:37.660 --> 18:46.060
every commit. But it's done as a list of full paths, and the hash for the version of each file.

18:46.060 --> 18:52.620
It's not a recursive tree of the directories; it's just the full list of every file in the revision.

18:53.180 --> 19:00.620
So, it's significantly bigger, because if you have 1 million files in your commit, then you

19:00.620 --> 19:05.500
will have 1 million files in the manifest. It has some advantages to have a flat structure, and some

19:05.500 --> 19:11.180
disadvantages, like you have 100 megabytes of things to deal with when you do all this computation

19:11.180 --> 19:16.940
and all this restoration. However, it has different properties than just the generic

19:16.940 --> 19:22.860
diffing that we were doing so far. So, all the work we did so far benefits everything: even if

19:22.860 --> 19:29.100
it's the manifest, even if it's the files, pretty much all content, we still benefit. However,

19:29.100 --> 19:38.380
the manifest is quite, quite different. It's a sorted list of unique items, which means that

19:38.380 --> 19:44.940
we can have, like, a lot of files while the deltas are actually small, because, if you have

19:45.020 --> 19:50.460
a million files, people don't do commits that touch a million files all the time. They only touch

19:50.460 --> 19:56.460
maybe 3, maybe 10, maybe 100; even 1,000 files is not that much. Even 1,000 out of 1 million,

19:56.460 --> 20:05.420
it's small. It's a small set of elements. And this kind of pattern, like large content and small deltas,

20:05.980 --> 20:11.260
applies to other things, not just the Mercurial manifest. We see this in config files and packaging

20:11.260 --> 20:16.540
definitions, so having better performance here helps a few more people. But we're going to

20:16.540 --> 20:25.100
focus on the specific manifest case here. Because it's a sorted list, we don't need to do the

20:25.100 --> 20:31.340
generic O(n²) diffing when we diff them. We can just walk the two sides in order, and as

20:31.900 --> 20:38.940
you see something on one side and not on the other, or you see something different, you can just

20:38.940 --> 20:45.500
walk your content, and you have a diff that is much simpler to produce. At that point, specializing

20:45.500 --> 20:54.220
the manifest code is more efficient: we have linear-time diffing. The common parts that we see

20:54.220 --> 21:02.540
when we look at the metadata are unambiguous, because, like, an entry will be before or after,

21:02.540 --> 21:08.380
there is no ambiguity about that. So, we can leverage the common parts that we detect from

21:08.380 --> 21:15.740
the metadata even more. Which means that we don't need to restore content to run a generic diff.
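
The specialized manifest diff, walking two sorted lists of unique items in one pass instead of running a generic O(n²) diff, can be sketched like this (hypothetical `manifest_diff`, not Mercurial's implementation):

```python
def manifest_diff(a, b):
    """Linear-time diff of two sorted lists of (path, node) pairs."""
    i = j = 0
    changes = []  # (path, old_node_or_None, new_node_or_None)
    while i < len(a) and j < len(b):
        pa, na = a[i]
        pb, nb = b[j]
        if pa == pb:
            if na != nb:
                changes.append((pa, na, nb))   # modified
            i += 1
            j += 1
        elif pa < pb:
            changes.append((pa, na, None))     # removed in b
            i += 1
        else:
            changes.append((pb, None, nb))     # added in b
            j += 1
    changes += [(p, n, None) for p, n in a[i:]]  # leftovers only in a
    changes += [(p, None, n) for p, n in b[j:]]  # leftovers only in b
    return changes
```

Each path is visited once, so the cost is linear in the size of the two manifests, and the output is unambiguous because the sort order fixes where every entry must appear.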

21:15.740 --> 21:21.500
We can just look at the individual deltas directly and have zero-copy diffing, except for producing

21:21.500 --> 21:32.780
the delta. At that point, the delta computation, that was the majority of the time when you

21:32.780 --> 21:41.900
do a pull that goes wrong, is now a small part of the pull. It still shows up in the profile,

21:41.900 --> 21:47.260
but there are a lot of other things, like opening files, the Python logic that drives all of this,

21:47.260 --> 21:54.140
the writing of the deltas to disk. And speeding that up further is not going to give

21:55.180 --> 22:00.380
large speed ups. For all these things, I already know that there are silly things

22:00.380 --> 22:05.660
and low-hanging fruit that we could fix, and also more advanced things we could do to speed them up.

22:05.660 --> 22:11.580
But it's not what I'm here for. It's more: okay, delta computation seems to be a solved problem.

22:12.460 --> 22:19.500
And when I say a solved problem: if I take a quite extreme case of pulling 1,000 revisions

22:19.500 --> 22:27.820
in a large repo, with the old code, in the best case scenario where you can reuse as many

22:27.900 --> 22:39.900
deltas as you can, I mean, we move from about 8 minutes to about 2 minutes, which is

22:39.900 --> 22:48.540
over 4x on the top-level operation. And if we're in the paranoid case, where you want to

22:48.540 --> 22:56.220
recompute absolutely everything, we were at about 4 hours before, and we're still at about 2 minutes now,

22:56.300 --> 23:02.700
which is one hundred times faster. Basically, the problem is no longer really a problem,

23:03.260 --> 23:15.660
especially if you look at the fact that, if you look at this number, which is a bit less

23:15.660 --> 23:21.740
than 2 minutes, and this number, which is a bit more than 2 minutes, you see that there is

23:21.820 --> 23:28.780
pretty much no difference. We're computing absolutely all deltas, and it's not 1,000 deltas: it's

23:28.780 --> 23:36.140
1,000 deltas for the manifest, and then some for the files, so there are more than 1,000 deltas

23:36.140 --> 23:42.700
to compute overall. And the deltas are, like, kind of no longer in the picture anyway. That's why

23:42.700 --> 23:50.780
it's also hard to see them in the profile at that point. It doesn't mean that we could not do the deltas

23:50.860 --> 23:58.460
faster. The way we fold them, we already know we could have a more efficient algorithm. The way

23:58.460 --> 24:04.940
we store the deltas on disk, we know the format is not ideal. And we know we could optimize the

24:04.940 --> 24:09.820
common parts a bit more, by not just doing the prefix and the suffix. There is a lot of things

24:09.820 --> 24:16.220
we could do more, but it really doesn't matter anymore in terms of how do I improve the

24:16.220 --> 24:23.420
performance of that specific operation right now. So, if we take a bit of a step back on what

24:23.420 --> 24:35.500
happened here: we did just a lot of small things that are not that advanced. We avoided doing

24:35.580 --> 24:43.180
useless work. We made the machine happy by doing things like memory locality, fewer allocations.

24:45.420 --> 24:51.900
We looked a bit at the data structures we had, to leverage them, by looking at the data to detect

24:51.900 --> 24:57.980
the common parts. We didn't invent a new data structure or a new algorithm: we already

24:57.980 --> 25:03.580
had that. It's basically a few lines of code to start at the beginning of the first one and

25:03.580 --> 25:10.540
the beginning of the second one, and that's it. And yet, we still have a lot of things we could keep

25:10.540 --> 25:18.460
improving over time. Not that anyone is really concerned, because the top-level operation

25:18.460 --> 25:23.580
is one hundred times faster; but it's one hundred times faster because the delta computation got

25:23.660 --> 25:37.100
about 500 times faster in that area. Which means, if it was a problem now, and it took 500 times more

25:37.100 --> 25:44.460
before, you would kind of not really be able to run it, because if it takes one hour now,

25:44.460 --> 25:50.300
it would mean it would take, like, 20 days before, and would be impractical to do at all.

25:54.380 --> 26:00.380
And this all started by: oh, we should do fancy stuff to avoid this costly operation, because

26:00.380 --> 26:06.380
this costly operation is inherently complicated, because the math says it's O(n²); so if it's

26:06.380 --> 26:11.340
slow, it's because the math says it's O(n²), right? But no, it wasn't.

26:12.300 --> 26:17.420
So, like, just doing basic profiling to understand, like, what is actually happening;

26:17.980 --> 26:23.420
doing tracing to see how many of some operation you do, like, if things trigger when they should

26:23.420 --> 26:30.380
not, and this kind of things; challenging the work that everything does: it works for all of this.

26:30.380 --> 26:37.180
As an aside, this means we sped up hg commit by about 10%, because we realized that

26:37.180 --> 26:43.740
we were computing the SHA-1 of something, and then checking that this content had the

26:43.740 --> 26:50.940
SHA-1 we just computed, because, in 20 years of development, the code grew organically,

26:50.940 --> 26:59.100
and this kind of stupid thing can emerge. Thinking a bit about your data can

26:59.100 --> 27:05.180
give quick speed ups, as we saw with detecting common parts from the data structure.

27:05.340 --> 27:10.460
And you do that for your code, but you should probably do that for your dependencies too:

27:10.460 --> 27:15.900
like, you're calling a library, and the call to that library takes time; maybe it's

27:15.900 --> 27:21.340
worth looking at why this library is called that many times, instead of trying to work around it.

27:21.340 --> 27:28.540
It works also for the Linux kernel: reading data from disk is slow, and so maybe you avoid it;

27:29.340 --> 27:36.140
but is it slow because reading data from disk is inherently slow, or is it because you're

27:36.140 --> 27:41.820
having a lot of page faults, and therefore you do a lot of round trips to the kernel, because you

27:41.820 --> 27:46.140
didn't warm your mmap properly, or because you're doing many tiny reads, and things like that?

27:46.140 --> 27:50.860
There's a bunch of cases where you think that, oh, this is just the kernel being slow, or the

27:50.860 --> 27:55.340
disk being slow; and actually, no: look at what is actually happening before doing things.

27:56.060 --> 28:04.700
And the last interesting lesson, I think, is that code ages. You could have code that has been

28:04.700 --> 28:12.540
carefully written by smart people a long time ago, but the hardware changed, the programming languages

28:12.540 --> 28:19.180
changed, the usage changed, and so something that was efficient 20 years ago might not be

28:19.260 --> 28:25.260
efficient today; it doesn't use some of the new optimizations and things like that. So, even if something

28:25.260 --> 28:35.900
was good in the past, maybe it's not good now. And during this whole talk, I said that

28:35.900 --> 28:47.100
diffing is an O(n²) algorithm, and therefore it's slow; and this was true in 2005, but

28:48.540 --> 28:55.420
it's no longer the case: like, in 2015, someone came up with a linear algorithm for diffing that would

28:55.420 --> 29:01.900
be faster, and would allow us to have a different structure and model for deltas that would be even

29:01.900 --> 29:08.700
more efficient. It means that even the mathematics that we work on is not set in stone, and sometimes

29:09.980 --> 29:16.940
things move in that area too. And of course, I'm not saying that if your code is currently slow, you should

29:16.940 --> 29:22.220
go get a PhD in computer science to make it faster; but there are people with a PhD in

29:22.220 --> 29:27.100
computer science, and they are making things faster, and so you can check what they're doing

29:27.100 --> 29:32.140
from time to time, and sometimes you can speak to them, and they will have good ideas about your problems.

29:35.340 --> 29:41.660
So, thank you for coming to my talk. I hope it was useful for you, and if you have any questions,

29:41.660 --> 29:42.940
don't hesitate to ask them

29:42.940 --> 30:11.820
Thank you for the talk. I have one question about the deltas, because, in your graph,

30:12.940 --> 30:18.380
you had your starting point, you have your small deltas, and then you put them all together, and

30:18.380 --> 30:24.780
so it's really convenient to go to the tip of your branch; but if you want to go to the middle of

30:25.900 --> 30:31.900
your branch, to just look at the history, obviously, with your big delta, you have to apply just a subset of it.

30:32.860 --> 30:38.380
You mean... yeah, it was about that, the big drop? After that?

30:39.020 --> 30:52.060
I was asking about the big yellow drop. The big yellow drop,

30:52.140 --> 31:04.060
number one? Yeah. Okay. And so, what's your question? So, here, if you want to go... if you want to

31:04.060 --> 31:11.740
check out A, it's simple: you apply C and A. But if you want to go to somewhere inside C, how do you

31:04.060 --> 31:11.740
do this? Oh, so this is not about checking out the content; there could be more of these things before,

31:11.740 --> 31:17.820
and, like, if you have a chain: if I want to get this content, I need to apply this delta and this one;

31:17.820 --> 31:24.540
but if I want to get this content, I just need to apply this chain to this one, and so forth.

31:32.620 --> 31:38.620
Here, the point is that you have two arbitrary revisions that have something in common,

31:39.340 --> 31:45.020
and you can compute the difference between these two revisions efficiently, without having

31:45.100 --> 31:55.020
to restore the full content of them. Does that make sense? Yes, but the big drop, the C big drop...


31:55.020 --> 32:01.420
That's the big thing, yes. What I understand is, it's like a really big patch that you can

32:01.420 --> 32:07.660
apply to the red part. Yes. And so, if you want to apply a subset of this thing, because you want to go somewhere

32:07.740 --> 32:20.140
in the history? Yeah, the way it works is that, since we know which parts A and B are going to touch,

32:20.140 --> 32:26.300
we don't need to restore the full text and then just take that part; we can just take a small part of it.

32:26.300 --> 32:46.380
No? Yeah, okay. Okay, I think... Any other questions? Okay, then, thank you for your talk.


