WEBVTT

00:00.000 --> 00:10.000
This next session is going to look at project maturity through a very specific lens,

00:10.000 --> 00:15.000
release votes in the Apache incubator, from the outside it can seem pretty procedural,

00:15.000 --> 00:19.000
but over time it reveals a lot about readiness, quality and shared responsibility.

00:19.000 --> 00:21.000
Please welcome Justin McLean.

00:22.000 --> 00:28.000
Welcome everyone, I am Justin McLean and I am from Sydney, Australia.

00:28.000 --> 00:37.000
This is my first time in Boston, so it's good to see everyone here and it's great to see so many smart

00:37.000 --> 00:40.000
and passionate people in the open source space.

00:40.000 --> 00:48.000
So this talk, while I'm talking about the Apache Incubator, this is not a talk about the ASF.

00:49.000 --> 00:54.000
You're going to get some practical stuff that you can take away from this,

00:54.000 --> 01:00.000
and this talk is about shipping code and about how the process around shipping code

01:00.000 --> 01:04.000
can either amplify stress or reduce it.

01:04.000 --> 01:09.000
So again, this talk is not about ASF policy, it's about lived experience

01:09.000 --> 01:12.000
and it's also based on actual data.

01:12.000 --> 01:19.000
Now, this is where we're going to find out whether this works or not.

01:23.000 --> 01:30.000
So I recently looked at a decade of releases and this was not just for one project,

01:30.000 --> 01:41.000
it was from a total of 33,000 emails, 1,600 vote release threads and 160 projects.

01:42.000 --> 01:44.000
So that's a lot of data.

01:44.000 --> 01:51.000
What was interesting about that is that the 160 projects were quite diverse.

01:51.000 --> 01:54.000
There were some very small projects, there were some large projects,

01:54.000 --> 02:00.000
they were written in different languages, they had different communities behind them.

02:00.000 --> 02:06.000
And I didn't exclude any projects either, so some of those projects didn't make it.

02:06.000 --> 02:16.000
Some of them failed, some of them retired because there was no community interest in maintaining them anymore.

02:16.000 --> 02:22.000
And that matters, because we had this wide set of data to draw conclusions from.

02:22.000 --> 02:24.000
It's not working.

02:24.000 --> 02:31.000
So the other interesting thing that came out from the data is that release processes are not neutral.

02:31.000 --> 02:38.000
Your release process will actually quietly answer some important questions for your contributors.

02:38.000 --> 02:48.000
And expectations are learned through people's experience of being involved with a release, rather than through your documentation.

02:48.000 --> 03:04.000
And the problem that the incubator had a decade ago was that we had a whole set of rules around releases.

03:04.000 --> 03:17.000
And the way that those rules were conveyed to contributors and people working on these projects was that it was learning more by punishment than it was by iteration.

03:17.000 --> 03:28.000
And as it says up here, releases encode power, and what I mean by that is that when you come to a release and you're voting on it,

03:28.000 --> 03:43.000
it means that you find out who can say no, who can hold up the release, who can block progress on it, and who needs to justify themselves in order to carry the release forward.

03:43.000 --> 03:54.000
And also what effort people need to go through, and how the feedback on the release is delivered, and all of this matters.

03:54.000 --> 04:07.000
And if the process feels like a test or, as I said, a punishment, people are going to maybe not make a release, or they're going to delay making a release because they don't think it's good enough.

04:07.000 --> 04:25.000
But if it feels, you know, clear and survivable, then people are going to actually try and do this. So your release process shapes behavior well beyond just legal compliance or, you know, other policy compliance.

04:25.000 --> 04:30.000
And that was one thing that was very, very clear from this data.

04:30.000 --> 04:43.000
And as I said, about a decade ago, the releases through the incubator were technically correct, they followed policy, but they were socially damaging.

04:43.000 --> 04:50.000
And the reason why that happened, and this wasn't malicious, this just came about by accident.

04:51.000 --> 05:07.000
First off, there was a lot of checkbox policing. So, you know, a release must have these things, there was a whole list of checkboxes, people would go through and check everything off, and if it didn't have something, they'd say, okay, that's broken, we've got to do it again.

05:07.000 --> 05:15.000
There were also a lot of rules that were not documented. These rules lived in people's heads; it was tribal knowledge.

05:15.000 --> 05:24.000
And so it was very easy to make mistakes. People could come along and think they'd done the right thing, but not realize they hadn't.

05:24.000 --> 05:36.000
And the only documentation that existed was the mailing list archives, which are really hard to go through and read, to try and work out what, in some cases, these unwritten rules were.

05:37.000 --> 05:49.000
So the review feedback often became the place where the rules were learned, and it wasn't something you could actually prepare for, which makes things difficult.

05:49.000 --> 06:01.000
And as I said before, this wasn't malicious, it's just the way things evolved, and the decade before the one I looked at was probably even more chaotic, because that was when the rules were actually being formed.

06:01.000 --> 06:08.000
So there were lots of arguments about what should and shouldn't be in a release, and how rigorous it should be, and all the rest.

06:08.000 --> 06:17.000
So, if we go back to this point in time, a project's first release would fail about 90% of the time.

06:17.000 --> 06:28.000
And it would usually get a large list of issues that had to be fixed, and then it would take several weeks for those to be fixed before another release candidate was tried again.

06:28.000 --> 06:35.000
And these days, about 75% of first releases actually pass.

06:35.000 --> 06:40.000
So there has been a dramatic improvement over that decade.

06:40.000 --> 06:46.000
And if there are some issues, they're generally minor, and they get fixed fairly quickly.

06:46.000 --> 06:50.000
Now, one thing that's really important to understand here is that the rules themselves didn't change.

06:50.000 --> 06:53.000
The rules are still the same as what we had 10 years ago.

06:53.000 --> 07:00.000
It was the experience of the people involved that has changed.

07:00.000 --> 07:11.000
It was the same rules, just a completely different learning experience.

07:11.000 --> 07:20.000
The main change was in review behavior, and the shift didn't happen automatically, it happened over time.

07:20.000 --> 07:27.000
I think there's some reasons for why this happened, but the data doesn't tell me that.

07:27.000 --> 07:35.000
And mentors on projects shifted from saying, that's wrong.

07:35.000 --> 07:40.000
You need to fix that, to say, well, that's not quite right.

07:40.000 --> 07:49.000
And here's why it's not right, because it doesn't follow this principle that we have, and this is how you can fix it.

07:49.000 --> 07:55.000
And so saying why something matters, how to fix it, and giving some reassurance to the project

07:55.000 --> 08:03.000
that things are actually on a good path, and that your next release is going to get there,

08:03.000 --> 08:10.000
is a much better way of approaching this than just saying, you haven't done this, start again.

08:10.000 --> 08:17.000
And again, it's the exact same rules, and just a completely different learning experience.

08:17.000 --> 08:33.000
And that means that people are no longer afraid to be involved in making releases, and more people are willing to try.

08:33.000 --> 08:46.000
So over that time, once people's fears or concerns were removed, they were much more likely to make releases; their behavior changed.

08:46.000 --> 08:54.000
Projects attempted releases earlier, they iterated faster, and they recovered more quickly from any mistakes.

08:54.000 --> 09:04.000
There have been some other changes over that decade, some of them complementary, but this was the main change that had an impact.

09:04.000 --> 09:13.000
Other changes, like tooling, for example, to automatically find most of these issues in a release, have certainly helped,

09:13.000 --> 09:27.000
but that's not the whole story. It basically happened because shipping code and creating a release became psychologically safer.

09:27.000 --> 09:34.000
The other interesting thing that came out from this is that releases are not just artifacts,

09:35.000 --> 09:40.000
but releases are actually onboarding material for your community.

09:40.000 --> 09:43.000
And this applies to every release, not just the first one.

09:43.000 --> 09:52.000
Each release teaches what done means, and how strictly you have to follow your rules,

09:52.000 --> 10:01.000
or guidelines, whatever you have, and which mistakes are fatal and which you can easily recover from.

10:01.000 --> 10:09.000
And the first release is probably doubly important, because that sets expectations for future releases.

10:09.000 --> 10:19.000
And it's still the case, because you get new contributors becoming involved in your project, that every release is going to be an onboarding experience for the people who are involved with that release,

10:19.000 --> 10:26.000
and the people who are watching as well, not just the people who are actually creating the release.

10:26.000 --> 10:36.000
Now at this point, I should probably point out that not all open source software does releases the same way the ASF does, but the same principles still apply.

10:36.000 --> 10:44.000
If your project doesn't do formal releases, it will have GitHub tags, or packages that it makes in some way.

10:44.000 --> 10:50.000
It will have container images, or even a stable API that people treat as a release.

10:50.000 --> 11:00.000
So every single one of those things can be seen as a way of onboarding new contributors to your project.

11:00.000 --> 11:10.000
The other thing that came out from the data of, as I said, 160 projects, was that the first release has a huge impact.

11:10.000 --> 11:18.000
The first release gives a project momentum, and that means more people get involved.

11:18.000 --> 11:28.000
And looking at the entire data, it's the clearest indicator of a project's trajectory, whether or not they're going to succeed or fail.

11:28.000 --> 11:46.000
And if a project makes a release early, in the data I was looking at we're talking within six months, then they attract contributors, form governance, make new releases quicker and quicker, and generally they survive and become a top-level project.

11:47.000 --> 12:01.000
Projects that don't make a release quickly and leave it, say, for a year or so, tend to stay small, they tend to stagnate, and sometimes they don't succeed.

12:01.000 --> 12:05.000
And this was a really clear outcome.

12:05.000 --> 12:16.000
And over the decade, the median time to a first release was initially about seven months, and it's now got down to four or five months.

12:16.000 --> 12:24.000
So we've seen a great improvement in that as well.

12:24.000 --> 12:34.000
One thing I should make clear here, though, is that earlier releases don't imply lower standards.

12:34.000 --> 12:42.000
In these cases, the quality bar didn't drop, it just became more visible.

12:43.000 --> 13:02.000
And early on, when projects attempted releases, there were mostly common technical failures, like, you know, issues with licenses and NOTICE files and trademark issues and so forth.

13:02.000 --> 13:12.000
Now those sorts of mechanical failures have declined, and they've declined for a number of reasons. One is that there's some automated tooling in place.

13:12.000 --> 13:17.000
The other is that there is better documentation that exists.

13:17.000 --> 13:29.000
Another factor, and I haven't actually done any research into this, is that we have a core set of mentors who mentor multiple projects, and they've gained experience over time.

13:29.000 --> 13:44.000
So I think that has definitely helped as well, but more importantly, there's a shared understanding amongst all the people involved, so they know what to do.

13:44.000 --> 13:50.000
And, you know, it's taken 20 years to get to this point.

13:50.000 --> 13:54.000
It didn't happen overnight, that's for sure.

13:54.000 --> 14:06.000
And most of the issues that we run into today are not issues that tooling or CI or AI can find; they're the sort of issues that require human judgment.

14:06.000 --> 14:19.000
And that's basically things like licensing intent, or provenance of code, or things along those lines.

14:19.000 --> 14:39.000
So automation and tooling have removed some mechanical noise, but that's just gotten rid of the low-hanging fruit, and we still need humans in the loop to actually review these releases, and their judgment and mentoring certainly helps.

14:39.000 --> 14:54.000
The other thing is that people tend to follow momentum. So I was talking about this first-release effect, and that attracts people to your project, that attracts new contributors.

14:54.000 --> 15:01.000
It's not roadmaps that attract contributors, it's actual progress that does that.

15:01.000 --> 15:15.000
And one other impact we could see in the data is that projects that released early also tended to add new contributors much earlier.

15:15.000 --> 15:23.000
So again, a decade ago, it generally took a year for some projects to add new contributors to their project.

15:23.000 --> 15:31.000
Now it's under six months, and that has a huge impact.

15:31.000 --> 15:44.000
If people outside your project see that contributors are being added to the project, they're going to stay around longer, because they think, oh, maybe that could happen to me as well.

15:44.000 --> 15:50.000
It also means that adding people to a project earlier removes some risks.

15:50.000 --> 16:00.000
And those risks include things like single-maintainer projects, so you have more people involved. It also helps with vendor neutrality.

16:00.000 --> 16:12.000
Because you've got more people involved, they're probably employed by multiple vendors, and it means that no one vendor is setting the direction of the project, and it's a shared effort.

16:12.000 --> 16:21.000
The other thing is that it's really easy to have a look at a project's commits to see what's going on there, but commits can lie.

16:21.000 --> 16:32.000
Particularly now, with a whole lot of automated tooling, Dependabot and all sorts of other things that will mess that statistic up.

16:32.000 --> 16:41.000
So instead, you can very easily look at release cadence and treat that as a measure of community health.

16:41.000 --> 16:59.000
If a project is releasing on a regular cycle, that's great. If you start to see gaps, those could be early warning signs of coordination problems, burnout, governance issues, or fragile ownership.

16:59.000 --> 17:09.000
So if you actually want to look at an open source project and see whether it's healthy, just look to see whether it has made releases and how often it makes releases and how often it adds new contributors.

17:09.000 --> 17:16.000
And that's going to give you a really good idea about how healthy that project is.

17:16.000 --> 17:23.000
Over that decade as well, the governance didn't get lighter, it was still there.

17:23.000 --> 17:35.000
It was just a little quieter, because those early crises in many cases just didn't happen.

17:35.000 --> 17:44.000
There were fewer escalations. People had a better shared knowledge of how to make a release, and it was better documented.

17:44.000 --> 17:50.000
So there were fewer confused projects, they made more releases, and more releases got shipped.

17:50.000 --> 17:58.000
So the fact that there's now less escalation than there was before doesn't mean there's less oversight.

17:58.000 --> 18:11.000
It just means that the oversight is happening earlier and more gently, and the projects have a better understanding of what needs to be done.

18:11.000 --> 18:14.000
So what to take back to your project?

18:14.000 --> 18:23.000
So as I was saying, you know, I've been looking at this from an ASF lens, but this is applicable to open source in general.

18:23.000 --> 18:35.000
So you should treat making a release, or shipping software, however you do it, as onboarding, both for existing contributors and for potential new contributors as well.

18:35.000 --> 18:43.000
And you need to make the expectations around that release explicit: document what your release process is.

18:43.000 --> 18:49.000
So, you know, these are the things that must happen, and these are the things that are nice to have.

18:49.000 --> 19:01.000
You know, if we make a mistake here, we must fix it in this release, or we can wait until the next release, as long as there's incremental improvement going on.

19:01.000 --> 19:08.000
So you don't want to rely on people learning the rules during a release review.

19:08.000 --> 19:22.000
You want to make checklists and examples and templates, anything that helps reduce that anxiety and, you know, shortens feedback loops.

19:22.000 --> 19:31.000
You also want to, when something does go wrong, explain why it is an issue, not just say it's wrong.

19:31.000 --> 19:40.000
Suggest how to fix it, and point to the underlying value or reason why this is something that is important.

19:40.000 --> 19:48.000
Also say whether it actually is a blocker for this release and whether or not it can be fixed in the next one or at a later time.

19:48.000 --> 19:55.000
The other thing you can do is track time to ship and time to first new contributor in new projects.

19:55.000 --> 19:59.000
That's going to give you a really good indicator of their health.

19:59.000 --> 20:08.000
And none of these four things are specific to the ASF; they apply to all open source projects.

20:08.000 --> 20:15.000
But the real point here is that releases are actually community design.

20:15.000 --> 20:20.000
They're not just legal compliance or how to follow policy.

20:20.000 --> 20:24.000
They have a much larger impact on your community than you may think.

20:24.000 --> 20:33.000
And the question isn't whether you enforce standards, it's whether you let people feel safe enough to be able to meet them.

20:33.000 --> 20:40.000
And the right process around your releases is going to lower the confusion and fear that people have.

20:40.000 --> 20:47.000
And when that goes away, people are going to stay, they're going to learn and they're going to contribute more.

20:47.000 --> 20:51.000
So that's it for me. Do we have any questions?

20:52.000 --> 21:01.000
Any questions from anyone?

21:01.000 --> 21:03.000
Thank you.

21:03.000 --> 21:05.000
Have a good rest of your day.

21:05.000 --> 21:10.000
And if you have any questions for me, come find me around the conference if you can and I'll be happy to answer them.

21:10.000 --> 21:13.000
All right, folks, we have about five minutes.

