WEBVTT

00:00.000 --> 00:12.000
So, our final talk of the day, rounding us off, is David Runge.

00:12.000 --> 00:16.000
I'm feeling sorry.

00:16.000 --> 00:19.000
Maybe that's the mic.

00:19.000 --> 00:26.000
Talking about adventures in oxidizing Arch Linux package management, which is going to be very

00:27.000 --> 00:29.000
very exciting.

00:29.000 --> 00:30.000
Take it away.

00:30.000 --> 00:32.000
Yeah, thanks.

00:38.000 --> 00:41.000
First off, it's really nice to still see a bunch of people here.

00:41.000 --> 00:45.000
I know it's been a very long day. For me too, I'm pretty tired.

00:45.000 --> 00:50.000
But, yeah, let's have a look at what we're currently working on.

00:51.000 --> 00:58.000
So, yeah, I'm trying to give, like, a grand overview of what Arch Linux package management actually

00:58.000 --> 00:59.000
means.

00:59.000 --> 01:00.000
Sorry.

01:00.000 --> 01:02.000
Is it not loud enough?

01:02.000 --> 01:04.000
All right.

01:08.000 --> 01:11.000
Just try to hook it up a bit higher.

01:13.000 --> 01:15.000
Maybe that helps.

01:15.000 --> 01:16.000
I don't know.

01:16.000 --> 01:17.000
Is that good?

01:17.000 --> 01:20.000
I'm sorry.

01:20.000 --> 01:27.000
So, I'll try to give an overview of what Arch Linux package management actually is in this context.

01:27.000 --> 01:32.000
And, yeah, talk a bit about the motivation behind this entire project.

01:32.000 --> 01:37.000
And what we're currently tackling or working on.

01:37.000 --> 01:41.000
So, yeah, first off a little bit of background about me.

01:41.000 --> 01:44.000
I'm a freelance software developer.

01:44.000 --> 01:54.000
I have been with Arch Linux for quite some time now, as a package maintainer, developer, signing key holder, etc.

01:54.000 --> 02:02.000
I do a bunch of Rust, pro audio things, and there was lots of Python in the past as well.

02:02.000 --> 02:10.000
I have mostly spent my time with the installation process and packaging, of course, like, way too many packages.

02:10.000 --> 02:18.000
And, well, infrastructure topics, and the ALPM project is somewhat of an infrastructure topic itself as well.

02:18.000 --> 02:21.000
So, to start with the obligatory:

02:21.000 --> 02:25.000
How many people in the room are actually using Arch Linux right now?

02:25.000 --> 02:27.000
Holy shit, that's a lot.

02:27.000 --> 02:28.000
Okay.

02:28.000 --> 02:29.000
That's great.

02:29.000 --> 02:32.000
That's probably 70% roughly in the room.

02:32.000 --> 02:36.000
For those that are listening in and have no clue,

02:36.000 --> 02:39.000
I'll try to give, like, a brief introduction.

02:39.000 --> 02:46.000
So, when we think of Arch Linux, we mostly think of pacman, I guess, or often we think of pacman as the project.

02:46.000 --> 02:49.000
It is very central to the distribution.

02:49.000 --> 02:53.000
It consists of a package manager that is written in C.

02:53.000 --> 03:02.000
We have makepkg, which is a package build tool that is written in Bash, with which you build the packages that you then install later on with pacman.

03:02.000 --> 03:09.000
We have tooling, like repo-add, that you can use to, in a very rudimentary way,

03:09.000 --> 03:12.000
deal with package repositories.

03:12.000 --> 03:24.000
And we have pacman-key, which is a thin wrapper around GnuPG for handling our pacman-specific GnuPG keyring.

03:24.000 --> 03:31.000
When we think of distribution packaging, so for Arch Linux itself, actually, then we're usually talking about devtools,

03:31.000 --> 03:40.000
which is, like, a collection of scripts and, well, a more unified experience, basically, to build in a clean chroot.

03:40.000 --> 03:55.000
And we have dbscripts, which basically wraps the aforementioned repo-add with many more bells and whistles, to deal with our repositories.

03:55.000 --> 04:02.000
And, yeah, many of you probably know the AUR, which is, like, a platform for package scripts.

04:02.000 --> 04:10.000
And there's a set of, like, unofficial user repositories, of course, that are pre-built packages.

04:10.000 --> 04:19.000
And we have a bunch of AUR helpers that probably quite a few of you use, I guess, to build and install things.

04:19.000 --> 04:32.000
But when we look into ALPM, so that's the short form for Arch Linux Package Management, then we actually look into, yeah, plain package building, I guess, in the beginning.

04:32.000 --> 04:38.000
We usually talk about, like, source repositories, where we have the PKGBUILD, which is the build script.

04:38.000 --> 04:48.000
We clone that thing, we build it, we get a package, and then sign it, and then we can basically also create a .SRCINFO, which is, like, a representation of

04:48.000 --> 05:06.000
the PKGBUILD that is parsable, because, yeah, having Bash and the metadata in the PKGBUILD is not really nice in certain contexts, right? If you want to show that on the website, you don't want to use Bash for that, I guess.

05:06.000 --> 05:14.000
This is a little bit of an example. As I mentioned, this is all Bash.

05:14.000 --> 05:25.000
So, PKGBUILDs are literally just Bash scripts that are evaluated, and that then build packages or lead to actual packages.

05:25.000 --> 05:35.000
So, if you're familiar with Bash, you will feel right at home. It's usually fairly easy to read; it's literally just build instructions, installation instructions and things like that.

05:36.000 --> 05:45.000
Then we have the .SRCINFO file, which is, well, that's what most, yeah, we can actually scroll.

05:45.000 --> 05:53.000
That is a bit of, like, an INI-style representation of the metadata that is in the PKGBUILD.

05:53.000 --> 06:07.000
It doesn't really contain any other information than what you will find in the PKGBUILD, basically, so very static info, actually.

06:07.000 --> 06:17.000
You may wonder, I mean, even if you have been using Arch: what is a package, actually? It's literally just a tar file, it's not really that magic.

06:17.000 --> 06:37.000
It contains all the files that you want to install on your system, but where it actually becomes interesting is when it comes to the files that describe the metadata about the package and the scripts that may be running, well, on your host when you install it or uninstall it.

06:37.000 --> 06:45.000
And this package metadata is largely comprised of a .BUILDINFO file, which describes the build environment in which the package has been built.

06:45.000 --> 06:53.000
The .MTREE file, which is used only for very limited purposes, but literally is just a compressed libarchive mtree file.

06:53.000 --> 07:02.000
And .PKGINFO, which literally describes all the package metadata that is then used by the package management system.

07:03.000 --> 07:18.000
We also have, well, an understanding of scripts that can basically run predefined functions on the host, as I said, on installation, update or removal.

07:18.000 --> 07:32.000
These all run as root, so it's a bit scary, actually, that we have this, but that's how many distributions actually deal with these post-installation scenarios to modify the system.

07:32.000 --> 07:47.000
Yeah, this is a nice example of what the .BUILDINFO file looks like. As you see down here, it has some info about what it was built from and which build tool it used.

07:47.000 --> 08:05.000
You see devtools, which we use for package building. We do have the long list of packages that are installed in that build environment, and this is how you reproduce a package on Arch Linux, basically using this file. That is pretty simple.

08:06.000 --> 08:17.000
.MTREE is not super interesting, I guess. You can also look it up, it's pretty well defined, I would say. It literally describes metadata about the files that are contained in that package.

08:18.000 --> 08:46.000
More interesting is actually the .PKGINFO, because it literally gives you very detailed and also dynamic information about the package. If you look at the provides declaration over there, it literally gives you a versioned soname dependency, which you don't have when you're just looking at the static data from the source files, basically.

08:46.000 --> 08:56.000
So yeah, this is literally what pacman then relies on to evaluate what information to display and how to compare packages to one another.

08:57.000 --> 09:09.000
A package repository basically just contains the repository metadata, which you can see up there, it's the repo DB files, more on that later.

09:09.000 --> 09:26.000
And pacman just syncs the state of that package repository, which is basically described by these metadata files, and then downloads any package that it wants to update, validates it and installs it.

09:26.000 --> 09:54.000
So we have two types of these that describe two things. Basically, we have the default one that describes packages, and here you see an example of just the package in version one; it has a desc file, to which we will come in a sec, and the lower one also has an additional files file, which contains all the files.

09:56.000 --> 10:24.000
Literally, yeah, that's basically the short form of that. A nice example is here: this desc file describes the state of that package in that repository. It has some additional metadata, such as an OpenPGP signature, but this is also kind of optional, it doesn't need to be in there anymore.

10:24.000 --> 10:35.000
It's just something that we still require for tooling reasons at the moment, but literally it contains a lot of the data that you've seen in the .PKGINFO before.

10:36.000 --> 10:43.000
It helps the package management system to make sense of, like, what is there remotely and what it can upgrade to, basically.

10:43.000 --> 10:52.000
The files files are super, super simple, they literally just contain a list of files and that's it.

10:53.000 --> 11:06.000
When we look at the local system, so the user's system, then we will find the same files, the desc file, the files file, also the mtree file, and they will basically describe the same thing.

11:06.000 --> 11:14.000
But there's a catch: the local desc file is different from the one that is in the repository metadata.

11:15.000 --> 11:30.000
You will have certain extras, such as validation, and you will have a reason why this is installed, et cetera, so there's some extra metadata encoded in these local desc files.

11:30.000 --> 11:38.000
And this is already part of the user system's database, basically, the state that someone currently has on their system.

11:39.000 --> 11:44.000
You'll find that in /var/lib/pacman; you can find these files in there.

11:45.000 --> 11:51.000
Yeah, the files file is the same, the mtree is literally the same as the other one.

11:52.000 --> 12:08.000
So, yeah, having gone through all of these metadata files, which are like a dry topic, I guess, the question could be like, what's the motivation behind this entire thing, like why would you want to improve, or what would you want to improve, and why are we looking into this.

12:08.000 --> 12:20.000
So, one of the topics is that what we are using on Arch is a system that is, by now quite old, has roughly half of this.

12:21.000 --> 12:26.000
Thanks to left who actually sits in the audience here, thanks for doing good info.

12:26.000 --> 12:35.000
Yeah, we do have artifact validation, as I mentioned earlier, based on a custom GnuPG keyring.

12:36.000 --> 12:46.000
This is quite painful because it's brittle, it's stateful, and GnuPG is no longer OpenPGP compliant, actually, so that's a pain.

12:47.000 --> 12:50.000
We need to do something about that.

12:51.000 --> 13:14.000
We do have a few closed loops within the context. If you're looking at pacman as something that you want to consume as an outside project, then, well, I mean, it's nice that pacman encompasses pacman and makepkg, so the loop of creation and consumption is literally in one project. It's great in some way, but

13:15.000 --> 13:27.000
the changes to the internal file formats and so on that are used by pacman in C are introduced in makepkg in Bash, so it's not very pretty.

13:28.000 --> 13:35.000
And that also means that the changes to these internal file formats are defined by

13:36.000 --> 13:47.000
pacman releases, so if you're relying on a certain version or certain behavior of an internal file format, you might be broken by a pacman update,

13:48.000 --> 13:53.000
if you're writing a piece of software that relies on our package ecosystem, basically.

13:54.000 --> 14:06.000
Yeah, most of our file formats are not really clearly specified or defined; they don't really have versioning or a deprecation process.

14:07.000 --> 14:18.000
That is kind of complex, as you can imagine, and it also means that certain behavior is literally just an implementation detail of pacman or makepkg, potentially.

14:19.000 --> 14:42.000
As I outlined earlier, yeah, producing this in Bash is hard and leads to errors that you can't really guard against easily. It's very hard without really strong unit and integration tests, which we don't really have in a good way, I think.

14:42.000 --> 14:52.000
Yeah, so these file formats, they're not documented, which we want to do something about.

14:53.000 --> 15:05.000
The concepts surrounding them are also not necessarily clearly documented. They exist sometimes as footnotes in other documentation that relates to them, but they're not clearly defined.

15:05.000 --> 15:16.000
And that often makes it implementation specific, so it might be that you have a parser and it behaves slightly differently because there's no spec.

15:18.000 --> 15:32.000
This means that we don't really have anything else but libalpm to link against currently, but that doesn't help us with all the file formats that we want to consume for metadata reasons.

15:32.000 --> 15:43.000
That leads us to either wrap Bash tooling in our own projects or to reimplement the wheel, basically.

15:44.000 --> 15:59.000
As an example of the former, you will see, like, dbscripts, which is also written in Bash, also for historical reasons, but, yeah, that makes it hard to do certain things right.

15:59.000 --> 16:07.000
Because it's very hard to do transactions properly, deal with rollbacks and things like that.

16:08.000 --> 16:28.000
A project that tried to literally reimplement the wheel to some degree is repod, which tried to improve on the concept of dbscripts by implementing, yeah, rollback and also transactional behavior for dealing with our package

16:29.000 --> 16:52.000
repositories. It basically needed to implement parsers and specs for all these file formats, again, because either they were not properly defined or only exist in an untyped, yeah, context, basically.

16:52.000 --> 17:13.000
As I mentioned, we do have an issue with validation in that way. For example, and I actually brought this up before, I think in 2023, when I first started working on this project: you can basically stuff a lot of things into our version comparison and it would not complain about it.

17:14.000 --> 17:37.000
Our packages are linted after the fact, so after building them, we have a tool that literally lints over them. I think Lintian is a very similar approach in Debian's package ecosystem, but, yeah, this could happen earlier in the process and less after the fact.

17:38.000 --> 17:52.000
Because of this fact, we now have a lot of existing parsers that have varying degrees of compatibility and that implement certain aspects of certain file formats, but not all of them, and they're not official either.

17:52.000 --> 18:16.000
That brings us to the oxidation part, which is fun, I hope. As I mentioned earlier, we want to have more specifications for all these file formats, or better specifications for existing ones, because several versions of some of these file formats actually exist already, and they are still out there, basically.

18:16.000 --> 18:23.000
If you update to a newer version, pacman may not necessarily be able to consume all the versions.

18:23.000 --> 18:41.000
I spoke about the GnuPG topic earlier already. We do want to do something about this and move to something that is stateless and is actually totally agnostic of what you're using it for.

18:41.000 --> 18:52.000
Ideally, even cross-technology. That's why we've been working on this UAPI spec that is currently under review.

18:52.000 --> 19:05.000
I implore you to have a look at our attempt at a very generic approach to providing verifiers for operating system artifacts.

19:05.000 --> 19:23.000
Currently, only a generic library for the lookup and for use with OpenPGP exists, but this is an extensible format, basically, and maybe you find this interesting.

19:23.000 --> 19:42.000
When we think about ALPM as a project, as something that is a Rust project that has extensible directions, then we have source, package management, repository and package, roughly, as topics, basically.

19:42.000 --> 20:03.000
In 2023, I started writing this library called alpm-types, which was supposed to contain a lot of common types that we use across all of these metadata files, basically, to be able to validate them properly.

20:03.000 --> 20:10.000
By now, it also contains a lot of documentation for common concepts and also some file formats.

20:10.000 --> 20:18.000
The alpm-parsers library is something quite new.

20:18.000 --> 20:24.000
I think Orhun is probably also somewhere, I think over there somewhere.

20:24.000 --> 20:32.000
Orhun started working on a lot of parsers for all these file types together with Anna, who isn't here anymore today.

20:32.000 --> 20:59.000
But this is mostly winnow-based and has improved quite substantially over what the current status is, where we actually get useful error messages for users of these libraries when they try to consume certain file types, certain file formats, and so on, which actually give you a meaningful response about what you're doing wrong.

20:59.000 --> 21:12.000
We do have a library that is just for internal testing. Basically, it allows us to integration test against all the live data that we have, so, the entire package set, all of it.

21:12.000 --> 21:19.000
Literally, it is pretty fast.

21:19.000 --> 21:27.000
What we're currently looking into, or trying to close up on, is dealing with the ALPM source side of things.

21:27.000 --> 21:41.000
There, we are mostly looking into having a clear specification for the .SRCINFO file format, and a library that allows us to parse and serialize the .SRCINFO file format.

21:41.000 --> 21:48.000
It's currently under review, but it's like 90% done, basically.

21:49.000 --> 21:56.000
When we're looking into the package domain, then we have achieved quite a bit over the last few months already.

21:56.000 --> 22:05.000
We have a specification for the install scriptlet, for the mtree format, and for the .BUILDINFO and the .PKGINFO formats.

22:05.000 --> 22:16.000
Likewise, we do have parsers and serializers for these formats now that are already functional. That's pretty nice.

22:19.000 --> 22:26.000
There's still a ton of work to be done, as you can imagine. I mean, there's lots of ground to cover.

22:26.000 --> 22:40.000
We do want to upstream a lot of the stuff that we currently work on. Those are literally changes to makepkg and repo-add in the future.

22:40.000 --> 22:48.000
For makepkg, very specifically for things like .SRCINFO, .PKGINFO and also .BUILDINFO

22:48.000 --> 23:05.000
creation, because then we can actually have validated file formats in the packages that follow some form of defined standard, in a way, and it would give you a proper error message when it fails.

23:06.000 --> 23:14.000
Although that is more on the back burner at the moment, we do want to have exports for further languages.

23:14.000 --> 23:19.000
As we do have some tooling integration, for instance, for libalpm.

23:19.000 --> 23:24.000
That's called pyalpm. That's a wrapper around this.

23:24.000 --> 23:37.000
Obviously, at the end of all of this work, we do have the plan to provide a drop-in replacement for libalpm in the future.

23:37.000 --> 23:47.000
For this, we would need to integrate the VOA specification, that's basically the one for the verification of artifacts.

23:47.000 --> 23:51.000
We need to implement a lot of stuff for that: repo syncing and package download,

23:51.000 --> 23:58.000
installation, upgrade, removal, etc. That all needs to be compliant with what is currently there.

23:58.000 --> 24:05.000
But as, for instance, libalpm links against GPGME, we don't really want to do that at all.

24:05.000 --> 24:16.000
So "drop-in replacement" in this case is meant in quotes, because we're not going to link against GPGME.

24:17.000 --> 24:28.000
In the future, we would also like to look into the possibility of unifying some of these existing file formats, because they do share a lot of commonalities.

24:28.000 --> 24:40.000
They have a bunch of overlap that we may just be able to better describe in a structured data format going forward, because that's way easier to parse.

24:40.000 --> 24:51.000
Writing the parsers was quite challenging, I would say, due to little hoops here and there in these file formats.

24:51.000 --> 24:58.000
They're not as trivial as they may seem when you look at them for the first time, because they're not quite INI.

24:58.000 --> 25:04.000
They're not quite this or that, and that makes it very hard.

25:04.000 --> 25:24.000
Given that we want to have the creation part for these files also covered for the repository, this would actually allow us to have better tooling and improve our tooling around repository handling in the future.

25:24.000 --> 25:36.000
This is more like a mid- to long-term goal, I would say, maybe more like next year, although some of this stuff may already be added this year.

25:36.000 --> 25:53.000
Locally, this means that, yeah, we have a bunch of libraries that still need to be written and file types that need to be described, et cetera, et cetera. A lot of fun, but also, yeah, lots of interesting work to be done.

25:53.000 --> 26:17.000
Funnily enough, this is nicely funded by the Sovereign Tech Agency. I think in October we started working on this with a team of four people, and this funding will go on until the end of this year.

26:17.000 --> 26:39.000
We hope to cover a lot of ground literally to be able to provide something that is also sustainable for the future and is able to improve what we use for packaging and how we deliver packages to our users in the future.

26:39.000 --> 26:45.000
You can contact us, well, you can first of all read a lot of documentation if you want to on the website.

26:45.000 --> 26:55.000
We have a repository that combines a bunch of crates, and we do hang out on IRC; you can also join that on Matrix if you want.

26:56.000 --> 27:00.000
Yeah, there's my mail address, et cetera, and here's my social,

27:00.000 --> 27:13.000
if you're on the Fediverse. Other than that, here's the slides link, if you want. I'll put that on the website later, but this is, yeah, the QR code, if you want to read the thing.

27:13.000 --> 27:20.000
There's lots of links in there, so if you're interested, it's probably quite nice.

27:20.000 --> 27:27.000
But that is also literally it, I'm probably a bit fast.

27:27.000 --> 27:35.000
If you have any questions, I'll be glad to answer them.

27:35.000 --> 27:57.000
Let's see if I can hear it actually.

27:57.000 --> 28:12.000
Yes, so you said that you want to upstream the changes and provide a drop-in replacement, but you also said that, for example, the file formats change with pacman updates.

28:12.000 --> 28:24.000
So what if, for example, there's a big pacman change planned, and then, suddenly, it is no longer compatible with the current implementation?

28:24.000 --> 28:37.000
And would it be possible to basically have two backends, like one written in C and one in Rust, for pacman or the library?

28:37.000 --> 28:41.000
I'm not entirely sure I got the second, the last sentence properly.

28:41.000 --> 28:44.000
I mean like, would it be going forward?

28:44.000 --> 28:51.000
It's a little unclear to be honest, because we're also still collecting information and collecting ideas around this.

28:51.000 --> 28:59.000
As you can imagine, I mean, a lot of this work has been done over the last 20 years, and it's an accumulated process, right?

28:59.000 --> 29:12.000
So even if you're going in the wrong direction, you might not necessarily realize that right away. You might only realize that, like, 10 years later, and you're like, wait a second, this may not be the greatest idea, why did we actually do this?

29:12.000 --> 29:35.000
And the thing here is that while we're accumulating these ideas of how to improve the situation and how to unify it, it definitely becomes clear that there's overlap. For instance, if you look at .BUILDINFO and .PKGINFO, there's some overlap, but there's also the idea of separation of concerns for these things.

29:35.000 --> 29:45.000
So in an ideal world, you won't need like special super tooling to make sense of a metadata file for your own purpose.

29:45.000 --> 30:01.000
So we need to kind of weigh the pros and cons here as well, but I believe that, nonetheless, a structured data format that relies on an already existing standard would be better here, for sure.

30:01.000 --> 30:08.000
Will you also try to replace makepkg and the makepkg scripts?

30:08.000 --> 30:14.000
Well, that's not really on the agenda for the work that is currently sponsored.

30:14.000 --> 30:30.000
I mean, there are actually some proofs of concept by Morgan, who has been part of the pacman team for some time, who worked on

30:31.000 --> 30:43.000
basically replacing the makepkg tool with a Rust implementation that has been around for, like, one and a half years or so, I think.

30:43.000 --> 30:59.000
Our current idea is actually to, first of all, replace, or make it possible to replace, the creation of these metadata files, because they're currently created in Bash and there might just be general trash in there that we don't want.

31:00.000 --> 31:09.000
And we've seen this in some package files where we were testing, and it would be nicer to be more robust for that for sure.

31:09.000 --> 31:28.000
And it is easier to just replace, like, executables in makepkg itself and call them differently depending on how you configure the build, but as is, there's no direct action from our side to replace makepkg.

31:29.000 --> 31:33.000
Hello, sorry, it's just maybe a silly question.

31:33.000 --> 31:35.000
Can you speak up a bit?

31:35.000 --> 31:36.000
Can you hear me?

31:36.000 --> 31:37.000
Yes.

31:37.000 --> 31:38.000
Okay.

31:38.000 --> 31:46.000
So I was going to ask you maybe a silly question, but I wanted to know if libalpm is going to be a dynamic library

31:46.000 --> 31:50.000
when you're going to rewrite it in Rust.

31:51.000 --> 32:09.000
Yeah, I mean, the idea is to have a drop-in replacement that would also be a shared library, literally offering the same C API, with air quotes because of the GPGME specifics.

32:09.000 --> 32:25.000
We do have some specific buy-in, basically, for GnuPG, and that's something that we need to look into, but that's something that we do not want to reproduce because it doesn't fit the way at all.

32:25.000 --> 32:26.000
Okay.

32:26.000 --> 32:27.000
Thank you.

32:35.000 --> 32:38.000
Okay, so if there's no more questions, thank you David.

32:38.000 --> 32:39.000
Yep.

32:39.000 --> 32:40.000
Thanks.

