WEBVTT

00:00.000 --> 00:08.960
Hey, well, once again, welcome to your next talk.

00:08.960 --> 00:16.880
We have here Pierre-Yves and Raphaël, presenting "Mercurial, 20 years and counting".

00:16.880 --> 00:18.960
So thank you very much and take it away.

00:18.960 --> 00:27.120
Hi, hello, can everybody hear me fine?

00:27.120 --> 00:29.440
Yep, cool.

00:29.440 --> 00:33.880
So we are Pierre-Yves and Raphaël.

00:33.880 --> 00:39.000
We work at a small company called Octobus. We specialize in version control, and

00:39.000 --> 00:45.240
specifically in Mercurial, a version control system that is 20 years old, and we're going

00:45.240 --> 00:48.440
to talk about its journey and why we're still here.

00:48.440 --> 00:52.840
And why we're here today talking about it.

00:52.840 --> 00:58.040
So first we're going to start with an interactive slide.

00:58.040 --> 00:59.240
Get your hand up, please.

00:59.240 --> 01:00.240
Everyone.

01:00.240 --> 01:01.240
Yeah, thank you.

01:01.240 --> 01:02.240
Everybody raise your hand.

01:02.240 --> 01:03.240
Keep your hand up.

01:03.240 --> 01:09.960
If you knew that Mercurial existed before coming to this talk. Yeah, pretty much.

01:09.960 --> 01:17.440
If you've ever used it before. Okay, fewer people. If you are using it today.

01:17.440 --> 01:23.360
Yep, some people are.

01:23.360 --> 01:29.160
If you think the project is active. More people raising their hands again, nice.

01:29.160 --> 01:36.720
Okay, so we thought that it might go something like this.

01:36.720 --> 01:44.440
I think the blue ellipse was maybe overconfident in its size, but generally I think

01:44.440 --> 01:50.120
that a lot of people have heard about hg. This is kind of a self-selecting audience, but

01:50.120 --> 01:55.320
if you ask around, most people have heard about Mercurial at some point, but the people that

01:55.320 --> 02:01.080
still use Mercurial and know that it's active are actually quite a small minority now.

02:01.080 --> 02:05.120
So how Mercurial is usually portrayed is: it's a distributed version control system that's

02:05.120 --> 02:12.320
very old, it lost to Git 10 years ago, and it's written in Python and it's slow.

02:12.320 --> 02:16.000
Git has 95% market share, Mercurial has 1%.

02:16.000 --> 02:18.920
That's the usual framing of it.

02:18.920 --> 02:23.440
So why talk about it, why am I here, why are we here?

02:23.440 --> 02:29.280
We keep having the same conversations over and over again at every event and every conference,

02:29.280 --> 02:38.000
every person we bump into in the hallway, and we have to say that we are paid full-time

02:38.080 --> 02:43.400
to work on it, and have been since 2013 for Pierre-Yves and 2019 for myself.

02:43.400 --> 02:49.280
We are not the only ones to be paid full-time, and we have volunteer contributors.

02:49.280 --> 02:57.400
We are still doing a lot of things, and we want to talk about it because it is, as I said,

02:57.400 --> 03:01.720
still active, still innovative, and still influential.

03:01.720 --> 03:04.160
So let's give an example.

03:04.320 --> 03:13.320
I can run an hg status that takes 100 milliseconds to complete for 1 million files on disk.

03:13.320 --> 03:18.720
No other version control system can achieve those speeds, especially without inotify

03:18.720 --> 03:21.600
and within a single process.

03:21.600 --> 03:26.240
That means that this is significantly faster than Git, at least in its current form,

03:26.240 --> 03:31.280
and faster than actually using inotify as well. For those of you who know FSMonitor and

03:31.520 --> 03:38.400
all of that: it is faster than using inotify, and it is definitely faster than inotify when it breaks.

03:39.440 --> 03:41.760
Mercurial is still influential.

03:41.760 --> 03:49.520
Two years ago Scott Chacon, of GitButler and GitHub fame, gave a talk about Git and the recent

03:49.520 --> 03:57.760
advances in Git, recent at the time, and he talked about six different points, which we will not get into,

03:57.760 --> 04:03.680
but three of those, the ones with the check marks, came from Mercurial or the Mercurial community.

04:03.680 --> 04:09.040
Having a commit graph to speed up operations; the FSMonitor thing I was just talking about

04:09.040 --> 04:15.360
was first introduced in Mercurial; and the ability to check out smaller subsets of your

04:15.360 --> 04:18.320
entire repo was a Mercurial thing.

04:19.760 --> 04:23.920
The previous talk, I don't know if anybody stayed over from the last talk, but if you did,

04:24.560 --> 04:33.600
thanks. Yeah, and so in the talk just before, Patrick talked about a bunch of stuff, and two of those

04:33.600 --> 04:39.600
are directly from Mercurial heritage. The large object promisors, not in their implementation,

04:39.600 --> 04:45.520
but the practice of having large files that are transparent to version control has been around

04:45.520 --> 04:53.200
for 10 years in Mercurial, and the git history command, the absorb concept, and having one command

04:53.200 --> 04:58.240
to manipulate your history have also been there for 10 years now, at this point.

04:59.680 --> 05:07.840
So, how did we get here? It's 2026. We have about 45 minutes left.

05:09.120 --> 05:14.080
Recapping 20 years would take about two minutes per year, 30 seconds per version,

05:14.080 --> 05:19.440
and 15 milliseconds per commit. Thankfully, we are not going to dive into every single detail,

05:19.440 --> 05:24.000
and we have a more structured approach to this presentation, and I will hand it over to Pierre-Yves.

05:25.840 --> 05:32.320
Okay, so before going into what Mercurial is today, let's talk about where Mercurial comes from.

05:32.800 --> 05:39.840
From the dawn of time, and by that I mean January 1st, 1970, people have been trying

05:39.920 --> 05:49.760
to version control their code, and for a while we had a new tool about every decade

05:49.760 --> 05:55.200
that improved over the previous one, and the field improved slowly like that,

05:55.840 --> 06:01.920
a new tool replacing the other. However, something different happened in the 2000s, where

06:02.000 --> 06:07.680
the internet encouraged collaboration and things like that, and a lot of new tools arrived. Like,

06:07.680 --> 06:14.080
we have GNU arch, Darcs, Monotone, Mercurial, Git, Bazaar, Fossil, and probably a lot of other tools.

06:14.080 --> 06:18.880
If your tool is not there, and it's still alive, please come give a talk at FOSDEM

06:18.880 --> 06:23.200
next year to explain why. Let's zoom in a bit on the Mercurial and Git things.

06:23.920 --> 06:32.960
People often don't realize how close the projects are. If you look at what they do and where they

06:32.960 --> 06:39.600
come from: they were released the same month, a couple of weeks apart, in April 2005;

06:40.480 --> 06:47.280
they were developed for versioning the Linux kernel; they were developed by kernel developers;

06:48.240 --> 06:54.160
they took inspiration from a previous version control system called Monotone; and

06:55.360 --> 07:02.480
their main data schema is basically a Merkle graph using SHA-1 to identify content. There

07:02.480 --> 07:09.520
is, of course, a lot of difference in the implementation, but there is also very much in common.

07:09.520 --> 07:15.440
At the conceptual level, there is only one big difference, which we will get into later,

07:16.400 --> 07:22.400
but overall, there is a lot of equivalence between the two. It was also a very good time

07:22.400 --> 07:27.040
for version control, because you had a lot of tools, they all tried different ideas, and they exchanged

07:27.040 --> 07:33.200
ideas. Here is a small example, where Mercurial has an interesting feature that

07:33.200 --> 07:38.320
Git is going to pick up, and Git has an interesting feature that Mercurial is going to implement,

07:38.320 --> 07:43.920
and also tools that didn't have as much success, like Darcs, still did a lot of interesting

07:44.880 --> 07:49.440
innovation regarding patches, partial commits, partial application, this kind of thing.

07:52.640 --> 07:59.760
And Mercurial had advantages at the time and had a growing user base, at about the same

07:59.760 --> 08:06.480
pace as Git. It was usually picked because it had a nicer UI; it had more features, like copy tracing;

08:07.440 --> 08:14.400
it was also much more portable than Git: because it's Python, it could run efficiently,

08:14.400 --> 08:22.320
without much rework, on Mac and on Windows, and still nowadays we have had people using

08:22.880 --> 08:29.200
Mercurial on Plan 9 or OpenVMS and things like that, very strange platforms, because

08:29.200 --> 08:34.720
it's much more portable, and that's kind of still true today. It had a powerful extension system that

08:34.720 --> 08:41.120
helped people do more things. It had early history editing, and then more advanced

08:41.120 --> 08:48.960
history editing. Large file handling came early, there were multiple iterations of that, and it had

08:48.960 --> 08:55.600
overall a safer UI, with a philosophy of: we want to prevent the user from making mistakes instead of

08:55.600 --> 09:03.440
helping the user undo the mistake later. So, a bunch of advantages. So the two projects

09:03.440 --> 09:08.320
started at the same time with a lot of things in common, and for a while they grew together in

09:08.320 --> 09:15.360
adoption. They were small players in a wider field dominated by other tools, but at some point

09:16.400 --> 09:22.720
Git started to grow significantly faster, and Mercurial started to stagnate and slowly

09:22.720 --> 09:27.200
lose its user base. There are a lot of reasons why it might have gone this way; it's not

09:27.200 --> 09:33.680
a simple black and white answer, but one key element in there is the introduction of GitHub.

09:34.560 --> 09:39.920
Here is a comparison in Google Trends between GitHub and Bitbucket, Bitbucket being a forge

09:39.920 --> 09:45.680
supporting Mercurial. You see that they both started in 2008, and GitHub immediately

09:45.680 --> 09:53.680
started to pick up, faster and faster, and compared to GitHub, Bitbucket is basically

09:53.680 --> 10:03.360
flat until about 2012, which is the moment they introduced Git support. So with this, we see that

10:03.360 --> 10:10.000
Mercurial definitely lost the battle to Git, and was kind of on a trajectory to be like

10:10.000 --> 10:16.400
the other tools, like Bazaar or Darcs, with fewer users. However, it didn't happen that way. Why?

10:17.600 --> 10:23.280
In the 2010s, more and more people were doing monorepos: you have one source repository, where

10:23.360 --> 10:29.840
you put all your source code. And especially, there are two companies that joined the Mercurial community

10:29.840 --> 10:40.080
to handle their monorepos, and they are Facebook and Google. And because they joined the project,

10:40.080 --> 10:43.920
a lot of things happened in Mercurial at the time that didn't really happen elsewhere.

10:44.640 --> 10:51.520
I'm not going to cover them in detail, but we had a lot of new features that landed at that time.

10:51.600 --> 10:56.480
Some of them you might recognize, because they are now landing in other tools, but a lot of things

10:56.480 --> 11:07.120
happened at the time. So why did these people pick Mercurial instead of Git to deal with their monorepos?

11:07.120 --> 11:13.040
Git was dominant, it was the leader, so it's kind of weird to pick a small player like Mercurial. Well,

11:13.840 --> 11:20.560
first, they tried. There is an email thread you can check, where Facebook went on the Git mailing list

11:20.640 --> 11:25.680
and said: we are trying to use Git at that scale, and we have trouble. Do you want, like,

11:25.680 --> 11:31.600
millions of dollars of engineers' time to speed up your project? And the answer was kind of,

11:32.160 --> 11:41.520
no. I mean, you can check the thread. And there were other arguments for why Mercurial was a good pick

11:41.600 --> 11:48.560
to do this. And one of them, the single biggest one, is about the Git API.

11:51.920 --> 12:00.160
So aside from what you just said, the human reasons, the philosophical divergence between

12:00.160 --> 12:04.160
what Git was trying to do and what Facebook and Google were trying to do, there was also,

12:04.480 --> 12:11.440
there are also fundamental differences in the models of Git and Mercurial,

12:11.440 --> 12:18.640
and Git's API kind of painted them into a corner. So what is Git's stable API?

12:19.360 --> 12:26.480
You might have heard Git present these layers: you have the porcelain layer, which is the user

12:26.560 --> 12:35.200
facing stuff. You have plumbing, which is the lower-level data handling layer. You

12:35.200 --> 12:42.640
have the binary files that actually live on your disk. This is the API that is presented in the,

12:42.640 --> 12:49.760
in the history of Git. And in practice, this is what happened. You had all of these different tools

12:49.760 --> 12:59.120
that didn't really talk to the CLI, or did so only partially, and directly just looked

12:59.120 --> 13:04.560
into the binary files. So your API became the binary files. If you're writing a ZSH

13:04.560 --> 13:09.920
prompt, or if you're libgit2 or GitHub's side or whatever, I mean GitHub's side is slightly different,

13:09.920 --> 13:16.720
but it's kind of the same point, which is that they are all, in one way or

13:16.720 --> 13:25.360
another, looking at the binary files on disk. The previous talk went into detail as to how they're

13:25.360 --> 13:31.840
trying to fix that problem by having a library nowadays. But this was definitely the

13:31.840 --> 13:39.840
status 10 years ago and is still the case now. Having this structure has some advantages. It's

13:39.840 --> 13:44.560
simple. The Git structure, the initial Git structure is quite simple. It's actually quite brilliant.

13:44.960 --> 13:53.440
It's very easy to access. And it helps tooling grow quite fast. For example, if you want to build

13:53.440 --> 14:02.560
an IDE, or if you want some graphing tool, a GUI, or a ZSH prompt, for example, or whatever,

14:02.560 --> 14:08.080
a shell prompt. This is quite easy. You can just plug in, read whatever data you need, and be done with it.

14:08.080 --> 14:16.720
So it helps foster an emergent environment of experimentation in that sense. However, it is very hard to

14:16.720 --> 14:22.880
change. I think I don't need to get into detail too much as to why, but it is quite hard to change.

14:22.880 --> 14:27.680
When everybody looks at the nitty-gritty details, whenever you change any detail,

14:27.680 --> 14:33.600
everything breaks. There's also the aspect that it being simple doesn't mean that it's right.

14:33.680 --> 14:38.240
And especially it doesn't mean that it will keep being right. Git might have been the right answer

14:38.240 --> 14:44.720
to the problem that it was trying to solve in 2005, but the world keeps changing and evolving

14:44.720 --> 14:52.480
and the scales are not the same, etc. And so it being simple is a good thing, but it's not enough.

14:54.480 --> 14:59.760
And this is kind of what we were saying. If you want to change everything, if you want to change

14:59.840 --> 15:04.560
anything, you need to convert the whole ecosystem first, and in lockstep, before changing anything.

15:05.600 --> 15:14.160
One example of this particular stack problem, the binary-files-as-API problem, was Microsoft in 2017.

15:14.560 --> 15:24.480
They were trying to scale their large Git repo and they tried to look at what was the performance

15:24.480 --> 15:30.240
problem, and the performance problem was the access to the .git directory. And so what did they do?

15:30.240 --> 15:34.240
They implemented a virtual file system, because that's the only way you can do it.

15:35.680 --> 15:41.280
If your interface is the file system, then the only way you can change the interface

15:41.280 --> 15:48.480
without significant community effort is by doing a virtual file system. Another example is the

15:48.480 --> 15:56.240
reftable thing, which is: instead of having all of your Git refs stored as single files

15:56.240 --> 16:05.040
in a folder, it's a binary packed format. It was decided in 2013. It was first implemented in 2017 in

16:05.040 --> 16:13.120
JGit and then became part of canonical Git only in 2024, which means that you have a 10-year

16:13.200 --> 16:20.320
span between the start of the project and its becoming a reality for everyone.

16:21.040 --> 16:25.600
That doesn't mean that Mercurial hasn't had its fair share of stuff that has lingered over the years,

16:25.600 --> 16:33.360
but this is kind of a good example of how there's an immense inertia to change.

16:34.400 --> 16:41.200
In contrast, Mercurial's stable API is much higher. Remember the layers I was showing earlier:

16:41.280 --> 16:47.360
Mercurial's API, the official one, and the one that most projects have plugged into,

16:47.360 --> 16:52.720
is the CLI or the server, the server being the wire protocol, talking about clone push pull,

16:52.720 --> 16:58.800
that kind of stuff. Some things plug into the logic or the data access code, but they are

16:58.800 --> 17:04.480
extensions that are forewarned that everything can break at every cycle, and often does, which means

17:04.480 --> 17:12.640
that they have usually been more tightly coupled with the upstream project, meaning that

17:12.640 --> 17:18.640
everything moves together and is encouraged to be thought of as a whole project.

17:21.760 --> 17:32.720
This layer was very useful for all the companies and the contributors trying to create

17:33.200 --> 17:38.880
new workflows and new ways of doing version control, because you could change,

17:38.880 --> 17:46.400
rip out everything from underneath the user without them realizing, which unlocked scaling capabilities

17:46.400 --> 17:51.600
and many different features. Because it turns out that some distributed version control systems

17:51.600 --> 18:00.720
are more distributed than others, so I'll let Pierre-Yves talk. So another aspect that is more subtle,

18:01.200 --> 18:07.680
and that also explains a bit of the scaling aspect, but also explains why

18:07.680 --> 18:14.160
there has been so much innovation in terms of user interface and workflow within the Mercurial space,

18:14.160 --> 18:23.040
has to do with the concept of how Mercurial models data. It's a complex topic and I don't

18:23.040 --> 18:28.320
want to turn this into a full computer science lecture, so I'm going to take a small example

18:28.480 --> 18:36.000
to try to make that clear. The main difference, when they started, between Git and Mercurial

18:36.000 --> 18:42.160
is the branching model, in terms of concepts. They both have a graph: you have commits,

18:42.160 --> 18:48.320
they have parents, and so on. So they both branch the very same way, as actual

18:48.320 --> 18:54.800
topological branches, but there is a difference in how you model it in the UX and how you deal with

18:55.680 --> 19:02.800
it when you exchange. In Git, a branch is a label that you put on a commit, and that label is going to move.

19:02.800 --> 19:08.160
So on the left you have the Git version, where you have the main branch that is on a commit, and you

19:08.160 --> 19:14.320
have the stable branch that is on a commit, and that's how you define your stable branch. In Mercurial, the branch

19:14.320 --> 19:20.720
information is within every commit. It's a property of the commit. You make a commit on the default

19:20.720 --> 19:27.040
branch or you make a commit on the stable branch. And from that state, the state of the graph,

19:27.040 --> 19:32.640
you can compute where the head of stable is and where the head of default is, but it's a property

19:32.640 --> 19:40.480
that emerges from that, and not a label that is explicit. So when you get a new head, for example,

19:40.480 --> 19:46.080
when you pull from somewhere, in Git you have to create a new branch with a label, with a convention

19:46.080 --> 19:51.520
that says: oh, this is actually kind of the same thing. While in Mercurial, it kind of just happens.
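
NOTE
Editor's sketch, not from the talk: a minimal Python model of the idea described above, where the branch name is stored in every commit and the heads of a branch are computed from the graph instead of being tracked as a movable label. The Commit class, the branch_heads function, and the node and branch names are all hypothetical, and the head rule used here (no child on the same branch) is a simplification, not Mercurial's actual implementation.
```python
from dataclasses import dataclass
@dataclass(frozen=True)
class Commit:
    node: str           # hypothetical short id standing in for a hash
    branch: str         # Mercurial-style: the branch name lives in the commit
    parents: tuple = ()
def branch_heads(commits):
    """Map each branch name to the set of its heads, derived from the graph alone."""
    by_node = {c.node: c for c in commits}
    # A commit stops being a head once one of its children is on the same branch.
    superseded = {
        p for c in commits for p in c.parents
        if p in by_node and by_node[p].branch == c.branch
    }
    heads = {}
    for c in commits:
        if c.node not in superseded:
            heads.setdefault(c.branch, set()).add(c.node)
    return heads
local = [
    Commit("a", "default"),
    Commit("b", "default", ("a",)),
    Commit("s1", "stable", ("a",)),
    Commit("s2", "stable", ("s1",)),
]
# A colleague's commit on the same branch, received by a pull.
pulled = [Commit("s3", "stable", ("s1",))]
print(branch_heads(local))
print(branch_heads(local + pulled))
```
In this toy model, pulling only adds a commit to the set. The second head on "stable" is never created by the pull itself; it shows up when the heads are recomputed.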

19:51.520 --> 19:55.680
You pull a new changeset with the same label. So if you look at the graph, you see there are two

19:55.680 --> 20:01.120
heads on stable, because it is a fact that there are two heads on stable. The rest after that

20:01.120 --> 20:07.120
is pretty much the same. You can do merges: in Git you move the label; in Mercurial,

20:07.120 --> 20:12.480
it just appears that only one head remains. You can merge them back into another branch

20:12.720 --> 20:16.720
and things like that. There are various ways to do branching in Mercurial;

20:16.720 --> 20:22.400
some are persistent, some are not. I'm not going to get into that now, to focus on the

20:22.400 --> 20:28.960
core differences. But the important part is that you have a simple model that can

20:28.960 --> 20:32.880
express any kind of complexity in your branch, like how many branches you have and how many merges and

20:32.880 --> 20:37.920
things like that. So this is not something you would actually see, but this is no more complex

20:38.880 --> 20:44.080
at the model level and at the conceptual level. This weird branch with a ton of heads and a ton of

20:44.080 --> 20:53.520
merges is no more complicated to express than the usual two-heads thing. And so the way to see that

20:53.520 --> 21:00.800
is that Mercurial has a global state where, when you take an action, everybody sees the same thing,

21:00.880 --> 21:10.560
while Git has more of a local state, where the branch has a local value. And the way to see

21:10.560 --> 21:21.680
it is when you have the two heads, if we look at that branching slide. When you do this

21:22.000 --> 21:28.240
synchronization (that is the previous slide), when you do this synchronization that has the two heads on the

21:28.240 --> 21:34.160
stable branch: in Git, this is the time where you get two branches. You pull something locally,

21:34.160 --> 21:41.520
a new branch is created, and now you have two heads. In Mercurial, you had the two heads all along;

21:41.520 --> 21:47.280
the two heads already exist when the two people make the commits on their machines.

21:47.520 --> 21:51.680
And when you synchronize, someone discovers that there are two heads,

21:51.680 --> 21:59.760
but they were there all along. Getting back here. So we have something that is more global and

21:59.760 --> 22:06.400
shared in Mercurial. It's not necessarily better or worse, but it's different. And especially,

22:06.400 --> 22:10.960
it means that branches in Mercurial are just data that you exchange the same way as all other

22:10.960 --> 22:17.920
data, while in Git you have to have specific solutions: you have the remote naming,

22:17.920 --> 22:23.440
so that the branches have a specific name; you have a solution for tracking; you have a solution

22:23.440 --> 22:28.160
for force push and this kind of thing. You have to have a specific solution for each specific problem,

22:28.160 --> 22:34.640
while in Mercurial that basically just emerges. It gives you a model that is more flexible, that

22:35.600 --> 22:40.720
propagates more simply. And this applies to pretty much every other concept in Mercurial. We have

22:40.720 --> 22:45.520
the phase concept, which prevents you from rewriting stuff that should not be rewritten; same thing,

22:45.520 --> 22:50.960
it's global and it propagates. We have the same for tags. There are a bunch of problems with the

22:50.960 --> 22:56.720
tag model, but the propagation is not really one of them. It has the same property. And

22:56.720 --> 23:04.080
pretty much everything else in Mercurial sticks to that model. So the global model of the data is

23:04.080 --> 23:12.080
a CRDT. CRDT stands for conflict-free replicated data type, which means you can

23:13.120 --> 23:19.760
propagate it freely. The Mercurial data model models inevitable states. The fact that you

23:19.760 --> 23:24.480
can have two heads at some point is something that will happen because you do distributed development.

23:24.480 --> 23:30.960
It's not something we invented. It's just we model something that exists. And it's a valid state

23:30.960 --> 23:36.160
in the model. It's not something you have to fix right away. You can keep doing other stuff. You can

23:36.160 --> 23:42.240
ignore these two heads for as long as you want. You can even exchange them with someone. If I

23:42.240 --> 23:45.840
need to merge something and I don't know how to merge it, I can send it to Raphaël. He will

23:45.840 --> 23:49.840
merge it and I will get the merge. And there is no problem with that. It's a valid state. You can

23:49.840 --> 23:57.040
exchange and propagate it. It also means you have a single unified concept. We don't care how you

23:57.120 --> 24:02.400
got two heads. Maybe you got two heads because you pulled something, or maybe you got two heads because

24:02.400 --> 24:07.920
you imported a bundle or imported an email, or because you created them yourself. All of these,

24:07.920 --> 24:13.440
all the different ways, end in the same state. You can solve it in different ways from the same state.
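
NOTE
Editor's sketch, not from the talk: a toy grow-only set in Python, one of the simplest CRDTs, to illustrate why it does not matter how the two heads reached you. The sync function and the changeset tuples are hypothetical; Mercurial's real storage and exchange are far more involved. The only point is that set union is commutative and idempotent, so every exchange path converges on the same state.
```python
def sync(mine: frozenset, theirs: frozenset) -> frozenset:
    """Merge two replicas; set union is commutative, associative and idempotent."""
    return mine | theirs
# Changesets as (node, branch, parent) tuples; all names here are made up.
base = frozenset({("a", "stable", None)})
alice = base | {("b", "stable", "a")}   # Alice commits locally
bob = base | {("c", "stable", "a")}     # Bob commits locally: a second head exists
via_pull = sync(alice, bob)             # Alice pulls from Bob
via_bundle = sync(bob, alice)           # Bob imports a bundle sent by Alice
print(via_pull == via_bundle)           # same state whichever way the data travelled
print(sync(via_pull, bob) == via_pull)  # receiving the same data twice changes nothing
```
Both replicas end up holding the same three changesets, including both heads, and neither side had to resolve anything before exchanging.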

24:13.440 --> 24:18.560
You have something unified that really limits the complexity when you start to combine these concepts

24:18.560 --> 24:25.920
together because they converge to the same thing. And that unlocks a lot of options in terms of

24:26.000 --> 24:32.880
workflow and operation. You can do your own scaling and things like that. A good example of that

24:32.880 --> 24:41.920
is the way Mercurial does history rewriting. In Mercurial you have an evolution history for the

24:41.920 --> 24:47.920
changesets, which means that whenever you do a history rewriting operation like an amend or

24:47.920 --> 24:53.920
rebase or something, we explicitly track what happened, with new data in the repo. You can see

24:54.000 --> 25:01.920
that as the history of the history. That history of history is a CRDT. So it has all the properties

25:01.920 --> 25:08.720
we talked about before. It's easy to exchange. The state that it represents, you just compute the

25:08.720 --> 25:16.160
state from it: which changesets are there, which changesets are alive, what kind of problems you have

25:16.160 --> 25:20.960
when you start to work in a distributed setting. We can express them because they exist and

25:20.960 --> 25:27.200
they are valid states. So we can automatically fix issues, and we can again do things like

25:27.200 --> 25:34.640
exchanging things with other people, propagating data, and things like that. And this creates a kind of

25:34.640 --> 25:42.880
even for stag default flow. Because when you do a command in any tool, you need to move from

25:42.880 --> 25:49.680
one stable state to another stable state. And because we have models that are much more expressive,

25:50.640 --> 25:56.960
we can have finer states than Git. In Git, when you do rebase -i, you need to move from

25:56.960 --> 26:03.280
a linear branch with one head to a linear branch with another head, while in Mercurial you can have

26:03.280 --> 26:09.520
a lot of intermediate states that have a valid representation. We can do that locally and

26:09.520 --> 26:14.720
globally. You can also express a lot more states when you collaborate with other people in a

26:14.800 --> 26:20.480
distributed way. And that fosters a lot of growth in that area. People can play with it

26:20.480 --> 26:26.560
a lot more, come up with new workflows and things like that. An example of that, which you might have

26:26.560 --> 26:34.400
seen, is rebase and push. They use something where, when you push a

26:34.400 --> 26:38.960
changeset, it is going to be rebased server-side and then served back to you. And they don't have to

26:38.960 --> 26:44.480
invent a specific solution for that. They can just build over all the things that already

26:44.560 --> 26:49.120
exist, which just represent that the operation happened. It's no different from you rebasing locally

26:49.120 --> 26:56.240
or your colleague rebasing. All the things converge to the same thing. I wish we could talk

26:56.240 --> 27:06.000
about this much longer, but there is a lot left to cover. Let's switch gears a little bit

27:06.000 --> 27:13.200
and go maybe away from the technical aspects of this and more about what happens when

27:13.200 --> 27:17.440
giant corporations get into your open source project and start contributing.

27:19.680 --> 27:26.080
There are very obvious upsides that I want to underline. This talk needs to talk about the

27:26.080 --> 27:32.400
problems, but that doesn't mean that we didn't enjoy the upsides. There are more contributors in general,

27:32.400 --> 27:40.480
more contributions, more funding for CI and for infrastructure and for sprint hosting, which means

27:40.480 --> 27:47.840
that people's hotels can be paid for them and we can exchange for travel, etc. And also quite

27:47.840 --> 27:52.720
importantly, the project becomes more credible. When you hear: oh, Mercurial powers the

27:52.720 --> 28:01.200
Google UI or whatever, then people think that it's quite important. It gives weight to the

28:01.200 --> 28:07.040
project. There are many upsides. I don't think anybody is really interested in listing all of the

28:07.040 --> 28:14.000
upsides. Again, we're not forgetting them. But there are of course many downsides. You have a

28:14.000 --> 28:22.000
big problem of sovereignty of how your project can manage itself. One thing that can be subtle

28:22.000 --> 28:26.880
is that you have contributor drain. So in a usual open source project that is lucky enough to

28:26.880 --> 28:33.040
have contributors because that's not most of them. You have people from all over the world and

28:33.120 --> 28:40.720
all over different profiles. You have the weekend warrior and you have someone who is in a small

28:40.720 --> 28:47.120
company and you have all of those different profiles. And when a very large company comes in, they

28:47.120 --> 28:51.840
have large problems. They have a lot of money. And so what they do is they approach people from other

28:51.840 --> 29:00.880
companies. They hire people from the community which gives a boost of productivity and a boost in

29:00.960 --> 29:07.600
focus, which is nice. But what also happens is that, with your contributor base, all of your

29:07.600 --> 29:16.880
eggs kind of move into giant baskets. And mid-sized companies have no consultants left for them.

29:17.520 --> 29:23.360
So your problems are spaced out between the few people that are here because they enjoy the sport

29:23.360 --> 29:30.480
and the people that are fixing their companies' problems. So that leads to a general loss of

29:31.040 --> 29:37.120
organizational skills as a project. For example, when I joined the project in 2019, which was near

29:37.120 --> 29:46.400
the end of that era, we still had post-landing CI, meaning that changes were put into

29:46.400 --> 29:53.280
the project before they passed through any form of rigorous testing. Maybe it was testing on someone's

29:53.360 --> 30:02.400
machine, who knows. But usually we caught breakage after it had landed. We had no first

30:02.400 --> 30:07.040
class forge to speak of. We had ways of exchanging patches and doing code review, but

30:07.760 --> 30:14.000
Mercurial was not a first-class citizen in any of those projects. And on a more

30:14.480 --> 30:24.880
organizational scale, we lost our ability to do hosting, and the sponsoring pipeline in general,

30:24.880 --> 30:30.320
of how you get money into the project and how you distribute it kind of got lost in all of this.

30:31.920 --> 30:36.640
Another way of looking at it is the product versus toolkit dichotomy.

30:38.080 --> 30:42.320
So a product is something that is plug and play. In general, you can buy it, you can download it,

30:42.400 --> 30:47.680
you can install it, you can whatever. And then you maybe tweak a bunch of config knobs and then it works.

30:49.680 --> 31:02.000
A toolkit needs some assembly. A way of seeing it is that Mercurial became a toolbox for larger companies

31:02.000 --> 31:11.280
to have a version control system and build theirs on top of it. And so the project started not

31:11.280 --> 31:19.840
being as ready to use outside of these environments and needed more configuration, maybe even

31:19.840 --> 31:25.440
some infrastructure. And let me tell you, you cannot just apt install the infrastructure from Google.

31:26.640 --> 31:32.400
Some companies were better at this than others. I took Google as an example, but they were actually

31:33.280 --> 31:37.840
quite good at upstreaming their changes and working with us as long as they were still using

31:37.840 --> 31:47.200
Mercurial. Some companies were not as good. This kind of moves to the "works for me",

31:47.200 --> 31:52.000
"works on my machine" kind of thing. When you contribute something upstream in

31:52.000 --> 32:02.000
Mercurial, for example, and at Google everything in their infrastructure is so controlled that

32:02.000 --> 32:05.760
they actually don't really care about hashes. When you think about a commit, you think about

32:05.840 --> 32:11.200
its hash, and you know that you can identify the exact tree that is under this hash, and you know

32:11.200 --> 32:16.800
exactly what that is, byte for byte. They don't really care about that. And so sometimes they implemented

32:16.800 --> 32:24.000
features that kind of didn't make sense if the hashes were unstable. It only made sense because

32:24.000 --> 32:30.400
they didn't care about hashes, for example. And sometimes other things were upstreamed with no real

32:30.400 --> 32:36.480
server implementations. You only had the client code, but the part that actually made it work,

32:36.480 --> 32:45.840
the server side was only on their machine and not in the project. And this is kind of the

32:45.840 --> 32:53.360
whole point of the incentive mismatch. People that work in large companies usually try to optimize

32:53.360 --> 32:57.360
for internal impact. What they want to do is they're hired to do a job and they want to fix

32:57.440 --> 33:02.240
an issue quickly. They have their managers breathing down their necks. And what they need to do

33:02.240 --> 33:07.760
is optimize for how much impact they can create within, say, this six-month cycle.

33:09.360 --> 33:13.680
Something that is good for the company is not always good for the project and vice versa. So you

33:13.680 --> 33:18.720
have friction there. You have something. It's not that everything was bad or anything, but it

33:18.720 --> 33:26.320
introduces friction and the incentives are not very much aligned. You can have multiple companies

33:26.400 --> 33:30.240
that fail to synchronize on efforts that are actually quite common. We had a use case in

33:30.240 --> 33:35.200
Mercurial, which was that the narrow feature and the sparse feature, one of them by Google, one of them

33:35.200 --> 33:42.400
by Facebook, were kind of trying to do a similar thing. Like it was in a different scope, but a bunch

33:42.400 --> 33:49.280
of the concepts were very similar. And it was almost refactored together, but not really. And now

33:49.280 --> 33:57.360
it's kind of in a half state where nothing is really done correctly because each company had

33:57.360 --> 34:03.840
different incentives and didn't want to go through the work of making it maintainable upstream.

34:03.840 --> 34:09.760
Because not many people get promoted by maintaining software. In general, it's more about

34:10.640 --> 34:17.360
features and speed and stuff like this. Your project can have a different deadline in general.

34:17.920 --> 34:24.960
We have a release cycle. It doesn't always line up with the needs of any company, so that also

34:24.960 --> 34:33.840
introduces friction and possibly burnout for the maintainers. That's not great, and you finish with

34:33.840 --> 34:44.800
half-done features. So this is kind of the end of the 2010s. We're moving into the 2020s.

34:48.160 --> 34:53.040
This is the end of the era. The tool itself, Mercurial, has weathered the Git storm. We're still

34:53.040 --> 34:57.120
around. We're still doing a bunch of stuff and we're still innovating and everything.

34:57.120 --> 35:03.760
But the project was dying. Mercurial as a sovereign entity, as something that can govern itself,

35:04.080 --> 35:11.920
was dying. Then, for many different reasons, some of them being the friction that I was talking about,

35:11.920 --> 35:17.520
some of them being external factors, Facebook forked Mercurial internally and started something that

35:17.520 --> 35:25.680
is now Mononoke and Sapling. And Google has slowly started to pivot to JJ, and you can see

35:25.680 --> 35:35.760
kind of a downtrend in their involvement. JJ, Jujutsu, was created at first as an experiment

35:35.760 --> 35:44.960
by Martin, who worked on Mercurial for seven years. He was a Mercurial developer for a good

35:44.960 --> 35:50.320
while, and the first contributor to Jujutsu is still a Mercurial contributor. So this is very

35:50.640 --> 35:56.400
interesting. Which means that we are moving on to the new wave, the dawn.

35:58.000 --> 36:01.600
As you might have noticed, there is some movement in the version control field.

36:06.240 --> 36:11.120
One of the things we have to keep in mind here is that version control is not a solved

36:11.120 --> 36:16.080
problem. Git arrived 20 years ago, but it was not the final answer to everything. There

36:16.160 --> 36:22.080
are a lot of use cases that are not well covered. It could be, for example, game developers that

36:22.080 --> 36:28.080
are not completely satisfied with what happens, or the AI people who are trying to version models and

36:28.080 --> 36:33.440
things like that. There is also too much friction for non-technical people. The problem

36:33.440 --> 36:38.960
solved by version control is useful for non-developers, and we still don't really have a good progression

36:38.960 --> 36:48.320
for things like that. Also, the world we live in keeps changing. In the sense that

36:50.400 --> 36:58.240
20 years ago, when we started, we didn't have AI; AI is something that happened two years ago.

36:58.240 --> 37:03.360
AI is going to change the way people develop and therefore is going to change the way people use

37:03.440 --> 37:09.440
version control. A bunch of things also changed, like cloud computing, which wasn't really a thing 20 years ago,

37:09.440 --> 37:14.560
and it changed the way people are going to deploy. There are a bunch of things that mean that

37:14.560 --> 37:22.000
version control is a moving target, and version control also changes the field of computing

37:22.000 --> 37:30.400
and therefore moves the target itself. A good example of why it's not solved is that people still use

37:30.400 --> 37:35.280
Mercurial, but people also still use SVN. Probably more people still use SVN than people still use

37:35.280 --> 37:40.640
Mercurial, and they don't use SVN because they are stuck on it. They use SVN because SVN

37:40.640 --> 37:45.120
solves some of their problems better than Git and Mercurial or any other tool do.

37:47.520 --> 37:55.360
And scaling is also inevitable which is part of why it's a constant ill-willed problem.

37:56.080 --> 38:02.000
The amount of source code we store everywhere just grows. The size of the amount of

38:02.000 --> 38:09.440
code we write just grows, and so you have more and more problems that you need to solve, and

38:09.440 --> 38:15.520
something fun is that the things we usually use to solve scaling problems don't really scale.

38:15.520 --> 38:21.280
Like if you do content addressing with hashes, you therefore have lookup tables, and the

38:21.280 --> 38:27.760
bigger they are, the less well they perform. A good example is inotify. Using inotify

38:27.760 --> 38:32.880
to speed up status is a great idea, except that you have a limit on the number of files you can follow

38:32.880 --> 38:39.920
with inotify. And the more you need inotify, the more files you have, and at some point it doesn't

38:39.920 --> 38:43.440
really work: if you have many working copies it doesn't work, if you have a lot of files it doesn't work,

38:43.440 --> 38:47.680
and that kind of thing. There are a bunch of problems that we can solve now but that are

38:47.760 --> 38:54.080
harder to solve later. But we mentioned a new wave of things, so maybe there is a new hope

38:54.080 --> 39:02.400
of solving more problems. Let's go back to the timeline we had before. We notice that only

39:02.400 --> 39:09.280
a few projects survived the 2000s, but there was a very long time, like 10-15 years,

39:09.280 --> 39:15.680
and actually 20 years now, between the time when all the things happened in 2005

39:16.000 --> 39:20.800
and the time when all the things are happening now, except for Pijul, which happened kind of in the

39:20.800 --> 39:27.840
middle of that. Pijul spun off from Darcs, and Pijul brings more things to the table.

39:28.400 --> 39:35.280
In particular, they did more work on the patch theory and they introduced a nice definition for

39:35.280 --> 39:40.400
what a conflict is, one that is actually a CRDT, so that's very interesting because you can get

39:40.480 --> 39:49.520
a DVCS that is a CRDT. Sapling is a fork of Mercurial from Facebook, and they work more on, like,

39:49.520 --> 39:55.520
the laziness, the scaling, and they have been revisiting some things from SVN, like

39:55.520 --> 40:04.320
directory-level history tracking and things like that. Jujutsu also plays with conflict tracking,

40:04.320 --> 40:11.120
and they also do a lot of work on the stacked-diff workflow and things like that, and

40:11.920 --> 40:16.800
it's not just the new tools: Git has been more active recently and solving more problems,

40:16.800 --> 40:22.000
introducing ideas from elsewhere, and a lot of things are happening. The tooling

40:22.000 --> 40:27.280
field too, like people from GitButler and others; a lot of things are happening, so it's

40:27.280 --> 40:32.080
kind of an exciting time where people are going to learn from each other again, and new things, new

40:32.080 --> 40:39.360
creations are going to happen, which is pretty nice. And Mercurial plays a part in that, in the sense

40:39.360 --> 40:43.920
of Sapling: it's a direct fork of Mercurial, it reuses a lot of the code base, a lot of the

40:43.920 --> 40:50.320
concepts and some of the UX. Jujutsu is actually built by Mercurial developers: the creator

40:50.320 --> 40:56.640
was a Mercurial developer, the main developer is also a former Mercurial developer, and it reuses a lot

40:56.720 --> 41:05.680
of the user experience concepts from Mercurial and tries to port them to Git, I mean

41:05.680 --> 41:11.440
onto the Git data structures. And Git itself has a lot of things it has gained now that are

41:11.440 --> 41:15.840
inspired more or less directly by Mercurial, like the commit graph that speeds up history

41:15.840 --> 41:20.640
traversal operations, that's something Mercurial has had for a long time; the fsmonitor we mentioned, like

41:20.800 --> 41:27.840
the inotify things for status, is something that landed in Mercurial in 2013; the work around the

41:27.840 --> 41:32.480
user experience, like absorb and things like that, are things that come directly from the innovation

41:32.480 --> 41:40.000
that happened in Mercurial in the 2010s. But those are things that happened in Mercurial in the 2010s, so

41:40.000 --> 41:47.840
now that the big companies have left and the things they did are slowly trickling to other

41:47.920 --> 41:56.080
companies, is Mercurial still relevant and alive? Actually, yes. We still have

41:56.080 --> 42:00.400
multiple full-time jobs funded: there are the two of us, and there are a few other people that

42:00.400 --> 42:06.160
kind of have their full-time job depend on Mercurial. We have a modern forge, which is a fork

42:06.160 --> 42:12.960
of GitLab, so we can do merge requests and stuff, modern workflows and things like that. And even if we

42:12.960 --> 42:20.000
still have about one percent of users, one percent of users is still a lot of people, and so

42:20.000 --> 42:24.960
it's not like nobody uses it. There are far fewer; you're not likely to randomly meet

42:24.960 --> 42:30.640
Mercurial users compared to meeting Git users, but there are still a lot of people,

42:30.640 --> 42:37.520
companies and things like that. And we still do a lot of innovation, both in performance and in

42:37.520 --> 42:44.880
features and things like that. For example, last year I gave a lightning talk on how Mercurial

42:44.880 --> 42:51.040
is able to show the diff between two versions of a changeset through a rebase, ignoring

42:51.040 --> 42:56.000
the changes in what it is based on from the rebase. You can check my lightning talk to learn more about it,

42:56.000 --> 43:02.160
and we also do more things in scaling and things like that. We are still innovative, and one way

43:02.160 --> 43:09.360
we innovate is that we have a good code base that is fast, we have good concepts like CRDTs

43:09.360 --> 43:16.960
that are very powerful, and also we work with researchers in research labs to have better

43:16.960 --> 43:22.160
algorithms that are actually new and unique, like with new academic papers about, oh, this

43:22.160 --> 43:26.240
problem in version control, actually you could solve it better this way. And that allows us to

43:26.320 --> 43:34.400
keep the pace with a smaller team and a smaller user base. And we still have a lot

43:34.400 --> 43:40.960
of uniqueness: the fact that we do fully distributed history rewriting is still unique to

43:40.960 --> 43:47.920
Mercurial. Some part of that got moved into Jujutsu and other stuff, but

43:47.920 --> 43:54.960
they didn't preserve the CRDT property that allows you to do a lot of things, and so we still have

43:54.960 --> 44:03.600
some specific stuff. And we also have an approach more focused on being generally useful

44:03.600 --> 44:11.120
instead of being tied to a large company's needs and things like that. A good example of that is that

44:11.120 --> 44:18.080
we have an approach that feels natural to us but in practice turns out to be quite unusual,

44:19.040 --> 44:26.080
which is that we try to solve the general case first. In the sense that sometimes, when you have a problem

44:26.080 --> 44:31.920
in a specific situation, you could just work around the problem, but if you actually solve the

44:31.920 --> 44:38.080
core of the problem it means you solve the core of the problem in more places and it means you are

44:38.080 --> 44:46.720
more universally useful for more users in more cases. One example of that is the

44:46.720 --> 44:52.800
way Mercurial scales. We did some scaling work in the past year, so we talk a lot about it;

44:52.800 --> 44:59.200
it's a good example. We have people that have repos with, like, tens of millions of changesets,

44:59.200 --> 45:05.760
millions of files, and we still have a very fast status. And we don't have a very fast status by

45:05.760 --> 45:11.280
cheating or taking shortcuts for their specific use case: this is the same status code

45:11.280 --> 45:16.160
that runs for small repos and for big repos. And the same thing: we can have very

45:16.160 --> 45:23.040
efficient exchange, pulling revisions or crawling the graph, and we don't do anything special to

45:23.040 --> 45:27.360
leverage the fact that it's centralized or leverage the fact that they have some specific

45:27.360 --> 45:33.280
infrastructure or some specific shape like we just make exchanging data faster and it makes

45:33.280 --> 45:38.160
their use case faster and every other use case faster and so instead of solving individual

45:38.160 --> 45:44.480
problems for individual people by adding more code for specific special cases, we can keep

45:45.360 --> 45:50.560
a code base and algorithms that are just efficient for everybody, and solve more problems

45:51.680 --> 45:57.200
for more people. Because they do that with stock Mercurial; they don't have anything special

45:57.200 --> 46:02.080
tied to their infrastructure. They mostly just run Mercurial and it's just fast.

46:04.320 --> 46:09.840
And we give these numbers, but we still have a lot of headroom: there are a lot of

46:09.840 --> 46:13.280
things we know are stupidly slow. There is a bunch of Python code that is slow,

46:13.360 --> 46:19.200
algorithms that are slow, data structures that are slow. So if, with a not-that-good code base

46:19.200 --> 46:23.200
and not-that-good solutions, we have this kind of performance,

46:26.320 --> 46:30.080
there is a lot of room to keep going with this approach of being general,

46:31.440 --> 46:33.760
and there are a lot of challenges to solve regarding scale.

46:33.920 --> 46:46.560
But yeah, let me finish up. We're not there yet, unfortunately. We still have a bad default

46:46.560 --> 46:53.040
UX because it came from 2005. We still have many unfinished features that we're trying to get

46:53.040 --> 46:59.120
over the line. We are still not shipping Rust by default, meaning that we have plenty of stuff that

46:59.200 --> 47:06.080
we are doing that is pushing the boundaries that people cannot use without going through

47:06.080 --> 47:09.680
extra steps individually. This is like the product versus toolkit stuff.

47:11.600 --> 47:14.400
Do you really have the fastest status in the world if nobody runs it?

47:16.400 --> 47:21.760
So let's conclude. Why do we still work on Mercurial?

47:22.720 --> 47:29.520
We have happy users, first of all, which is nice. We still have funding; for an

47:29.520 --> 47:35.680
open source project that's pretty good. We have a good toolkit that is sometimes best in class

47:36.880 --> 47:42.240
in different aspects. We have a technological edge in certain areas and we have a conceptual edge

47:42.240 --> 47:49.840
in some other areas. It is still universally useful. We are not working at a company making a

47:49.840 --> 47:56.560
lot more money than we're currently making, and we're not working on other software,

47:56.560 --> 48:03.520
because we think this is universally useful and can be used by everybody. We're always

48:03.520 --> 48:09.040
moving quickly. With all of the stuff that we said, we haven't really lost pace that much, and we're

48:09.040 --> 48:17.120
still experimenting and helping advance the field. So what's next? Keep up the good work:

48:17.120 --> 48:23.520
doing scaling, the UX improvements. Actually ship it so that people can use it.

48:23.520 --> 48:28.800
It's actually a problem, and we are gradually fixing it. Last year it was terrible,

48:28.800 --> 48:34.720
now it's better, next year it will be great, hopefully. Staying community-focused, meaning

48:34.720 --> 48:40.000
improving the contributor experience, our CI, the tooling. We have a sprint coming in London.

48:40.160 --> 48:47.280
We're thinking about a new user experience so that we're not bound to our old one

48:48.240 --> 48:55.760
as much. First-class Git interaction: we know that the world runs on Git, so we have plans to make

48:55.760 --> 49:02.080
that work as well. Modernizing the UX in general, that's what I just said: having

49:02.080 --> 49:08.080
a new way of presenting Mercurial to the world. So what's in it for you? If any of you

49:08.160 --> 49:14.880
still use Mercurial, which we know you do because you raised your hand, come talk to us, please. That would

49:14.880 --> 49:20.960
be very nice. We can only help you if we know who you are. If the progress is too slow, join us,

49:20.960 --> 49:26.720
come help out. If you're interested, get in touch. We are a consulting company and we would like

49:27.520 --> 49:32.800
having you on board. Otherwise you can wait 5 to 10 years and the stuff that we do will trickle

49:32.800 --> 49:39.440
down to your favorite tool. You can get in touch at those links. We have a sprint in London on

49:39.440 --> 49:46.480
those dates. We have a BoF, if you want to keep this conversation going,

49:46.480 --> 49:52.000
at 3 PM today, and we have two presentations tomorrow, one in the Rust room and one in the

49:52.000 --> 49:58.800
software performance room. Thank you all for coming, and yeah, that was it.

