WEBVTT

00:00.000 --> 00:10.600
All right, so next up is Mark.

00:10.600 --> 00:14.160
He'll be telling you about Python packaging.

00:14.160 --> 00:15.160
Indeed.

00:15.160 --> 00:16.160
Thank you, Bjorn.

00:16.160 --> 00:17.660
Yes, my name is Mark Ryan.

00:17.660 --> 00:21.060
I work for Rivos and we're going to be spending the next 30 minutes talking about

00:21.060 --> 00:22.560
my favorite subject:

00:22.560 --> 00:25.600
Python package installation on RISC-V.

00:25.600 --> 00:28.920
I don't know how many of you have actually tried using Python on RISC-V.

00:28.920 --> 00:32.160
Those of you who have have probably enjoyed a bit of a mixed experience.

00:32.160 --> 00:35.920
For those of you who haven't, I'm going to demonstrate just how frustrating an experience

00:35.920 --> 00:36.920
it can be.

00:36.920 --> 00:39.320
I'm going to do this using a little video I pre-prepared.

00:39.320 --> 00:43.920
This is a video taken of me typing away on RISC-V.

00:43.920 --> 00:45.920
You can see that it is RISC-V.

00:45.920 --> 00:47.720
The first thing I'm going to do is start Python.

00:47.720 --> 00:49.520
As you can see, Python works nicely.

00:49.520 --> 00:54.040
You can start it up, you can run Python programs, and better still, you can install Python

00:54.040 --> 00:55.040
packages.

00:55.040 --> 00:57.920
In this case, I'm installing something called requests, which is a very popular Python

00:57.920 --> 01:01.920
package for sending HTTP requests, as you might imagine.

01:01.920 --> 01:05.480
It is installing without problem, it's also installing a lot of extra dependencies, which

01:05.480 --> 01:07.160
install without problem.

01:07.160 --> 01:11.120
To prove it works, I'm going to start the interpreter, import the package, and try and send

01:11.120 --> 01:12.120
a request.

01:12.120 --> 01:16.280
I apologize about the slow typing here, but we're sending a request to the FOSDEM

01:16.280 --> 01:26.280
website, and we should get some text back soon.

01:26.280 --> 01:28.680
There, it works, okay.

01:28.680 --> 01:33.880
Python sort of works, well, it does work, but let's try and install a different package.

01:33.880 --> 01:36.280
This time we're going to install something called SentencePiece.

01:36.280 --> 01:39.280
Now, SentencePiece is a word tokenizer.

01:39.280 --> 01:43.280
It's used in a lot of AI and machine learning workloads.

01:43.280 --> 01:46.680
You're going to notice it's going to take a little bit longer to install. The installation

01:46.680 --> 01:50.680
script looks a little bit different, it's downloading a tar.gz file, it's saying something

01:50.680 --> 01:52.880
scary, "Installing build dependencies".

01:52.880 --> 01:56.240
I'm going to let you into a little secret, this is not going to work, and it's going

01:56.240 --> 02:05.480
to fail rather spectacularly in maybe five seconds, and there we go, you've got a lovely

02:05.480 --> 02:11.360
error message, that is not a great user experience, I think you can all agree.

02:11.360 --> 02:15.040
So why is it that we can install some Python packages on RISC-V without problem

02:15.040 --> 02:16.520
and others we cannot install?

02:16.520 --> 02:19.360
Well, it turns out there's two different types of Python packages, there are pure Python

02:19.360 --> 02:25.080
packages, those written purely in Python, and there are binary packages, which are written

02:25.080 --> 02:29.600
in Python and a mixture of C, sometimes C++, and increasingly Rust.

02:29.600 --> 02:33.680
You can download pure Python packages from PyPI, which is the central repository that stores

02:33.680 --> 02:37.800
Python packages, on RISC-V without problems, but you cannot download binary packages,

02:37.800 --> 02:42.360
and the reason is that PyPI won't allow you to upload binary wheels for riscv64, and

02:42.360 --> 02:48.760
a wheel is the file format for distributing Python packages. And it's not just SentencePiece

02:48.760 --> 02:53.640
that's affected. A lot of the very popular Python packages, the important Python

02:53.640 --> 02:58.600
packages, are binary packages. If you think about things like NumPy, SciPy, scikit-learn,

02:58.600 --> 03:03.880
pandas, Pillow, TensorFlow, PyTorch, PyArrow, all of these things are binary packages,

03:03.880 --> 03:08.960
so they cannot be easily installed on riscv64 devices at the moment.

03:08.960 --> 03:12.880
So just stop and think for a little bit about what a serious problem that is for the vendors

03:12.880 --> 03:17.520
of riscv64 devices. They're going to have customers, their customers are likely to have

03:17.520 --> 03:22.440
Python workloads, and those workloads are going to depend probably on one or

03:22.440 --> 03:25.320
more of these packages either directly or indirectly.

03:25.320 --> 03:29.560
If you look at the bottom of any sufficiently complicated Python stack, you're going to find

03:29.560 --> 03:33.600
NumPy at the bottom, and it's not possible to easily install NumPy at the moment.

03:33.960 --> 03:35.960
on riscv64 devices.

03:35.960 --> 03:39.960
Now, you might be thinking that I'm over-dramatizing this, and there is a solution:

03:39.960 --> 03:43.280
you could just install from source and pip allows you to do this.

03:43.280 --> 03:47.680
So if you remember, when we were trying to install SentencePiece, what pip did is it

03:47.680 --> 03:51.800
looked to see if there was a RISC-V binary, there wasn't, and so it downloaded a source distribution

03:51.800 --> 03:56.600
and tried to install that. The installation failed because I didn't have the right build dependencies

03:56.600 --> 04:01.680
on my VM, but you might think that is a possible solution to your problems, but allow me

04:01.680 --> 04:03.600
to just disabuse you of that notion.

04:03.600 --> 04:06.800
Building binary packages is very, very difficult.

04:06.800 --> 04:11.720
It's a complicated process and it's very, very difficult to get right, and I'm going to demonstrate

04:11.720 --> 04:16.680
how difficult it is using some VMs that I have pre-created.

04:16.680 --> 04:20.440
So it's time for the live demo, and I really hope this works.

04:20.440 --> 04:22.360
What I have here is I have two VMs.

04:22.360 --> 04:26.200
On the right hand side, yes, it says noble.

04:26.240 --> 04:32.080
So that is Ubuntu 24.04, and on the left hand side, I have jammy, that is Ubuntu 22.04.

04:32.080 --> 04:36.440
I hope I said that right, noble is 24.04, jammy is 22.04.

04:36.440 --> 04:40.880
I'm going to refer to them as noble and jammy from now on, because it's really difficult

04:40.880 --> 04:42.600
to say those version names.

04:42.600 --> 04:49.560
These are running directly on my Mac, so they are AArch64 VMs. I'm using QEMU to run these.

04:49.560 --> 04:54.480
And they both have a shared folder, which is called FOSDEM, well, you can see it there.

04:54.480 --> 05:00.760
It's FOSDEM 2025, so I'm going to use that shared folder to copy files between the two VMs.

05:00.760 --> 05:03.880
Now the first thing I'm going to do is I'm just going to try and build SentencePiece from source.

05:03.880 --> 05:07.040
I'm going to build it on noble. I want to build it on noble because that's the more recent

05:07.040 --> 05:10.080
distribution, so it has a more recent toolchain.

05:10.080 --> 05:13.120
So I'll get all the latest optimizations.

05:13.120 --> 05:14.920
So I've got a little script to build it, actually.

05:14.920 --> 05:17.920
Let me just show you what that script is doing, it's not doing anything magic, it's just

05:17.920 --> 05:21.920
to save me from messing up on the typing. It's activating a virtual environment in which

05:21.920 --> 05:25.480
I've preinstalled all the dependencies you need to build SentencePiece, and then

05:25.480 --> 05:27.640
it just builds it.

05:27.640 --> 05:35.800
So let's build it, this should take about 10 seconds, I think, so we'll pause for a minute

05:35.800 --> 05:48.160
while it builds, it's got a faster laptop, yes, it's finished, okay, and if you see it's

05:48.160 --> 05:52.120
created this file here, it's created this wheel file here.

05:52.120 --> 05:59.480
Now what I'm going to do is I'm going to copy this file, and I'm going to copy into my shared

05:59.480 --> 06:09.480
folder, oops, and we're going to switch over to our jammy, I should have done this while

06:09.480 --> 06:14.160
it was building.

06:14.160 --> 06:17.520
We're going to switch over to our jammy VM and we're going to try and install it, and

06:17.520 --> 06:23.600
actually I'm going to just create a virtual environment in which to install it.

06:23.600 --> 06:32.040
So let's install our wheel, and it fails and it fails straight away with a rather confusing

06:32.040 --> 06:33.040
error message.

06:33.040 --> 06:36.360
It says the wheel is not a supported wheel on this platform.

06:36.360 --> 06:40.320
Now to understand what this means, you can look at the file name, the file name actually

06:40.320 --> 06:42.480
gives you a clue.

06:42.480 --> 06:46.200
The file name is composed of a number of different parts, we have the name of the package,

06:46.240 --> 06:50.680
we have the version number, and then we have these two tags here, and these indicate the minimum

06:50.680 --> 06:54.960
version of Python required to run this package, and also the Python ABI against which

06:54.960 --> 06:58.880
the package was compiled, and they both say Python 3.12, and that's because on Ubuntu

06:58.880 --> 07:04.520
noble, we have Python 3.12, but on Ubuntu jammy, we have Python 3.10, and so

07:04.520 --> 07:10.600
this wheel is not compatible with Ubuntu jammy in its default state.
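
NOTE
Editor's sketch, not part of the talk: the filename parts described above
can be split apart in a few lines of Python. The filename is a made-up
example (its version number and platform tag are assumptions, not read from
the demo), and parse_wheel_filename is a hypothetical helper, not a pip API.
```python
# Split a wheel filename into name, version, Python tag, ABI tag, platform tag.
from typing import NamedTuple
class WheelName(NamedTuple):
    distribution: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str
def parse_wheel_filename(filename: str) -> WheelName:
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    parts = filename[: -len(".whl")].split("-")
    # An optional build tag may sit between the version and the Python tag.
    if len(parts) == 6:
        del parts[2]
    if len(parts) != 5:
        raise ValueError(f"unexpected wheel filename: {filename}")
    return WheelName(*parts)
# Hypothetical filename in the shape produced by a build on noble.
wheel = parse_wheel_filename("sentencepiece-0.2.0-cp312-cp312-linux_aarch64.whl")
print(wheel.python_tag, wheel.abi_tag, wheel.platform_tag)
# jammy's default interpreter is CPython 3.10, whose tag is cp310.
print(wheel.python_tag == "cp310")
```
The second print shows False: a wheel tagged cp312 does not match the cp310
tag of the default Python on jammy, which is why pip rejects it.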

07:10.600 --> 07:12.880
So that's the first kind of complication you're going to meet.

07:12.880 --> 07:16.480
When you build a Python wheel for distribution, you don't just build one wheel, you have

07:16.480 --> 07:18.040
to build multiple wheels.

07:18.040 --> 07:21.840
One, at least, for each version of Python your customers are going to use, and it's

07:21.840 --> 07:25.120
actually a lot more complicated than that, as we shall see in a minute.

07:25.120 --> 07:32.120
So we can try and get around this problem by, well actually I anticipated this problem,

07:32.120 --> 07:37.760
and I pre-installed Python 3.12 on jammy, so we should be able to create another virtual

07:37.760 --> 07:47.480
environment that uses Python 3.12, let me delete the old one, let's create, and then

07:47.480 --> 07:52.480
we should be able to install it.

07:52.480 --> 07:59.920
Okay, let's try that, and this time it works, let's see does it actually work?

07:59.920 --> 08:07.280
You can see it says Python 3.12 now, we're going to try and import it, and we can't

08:07.280 --> 08:11.520
import it, it fails, and it fails with another obscure error message.

08:11.520 --> 08:14.840
And if you read the error message, at least it's not two pages long this time, it's complaining

08:14.840 --> 08:16.200
about glibc.

08:16.200 --> 08:20.360
And the problem here is that SentencePiece, we compiled it on noble, and noble has

08:20.360 --> 08:27.880
glibc 2.39, and jammy only has glibc 2.35, and when we compiled this on noble, it

08:27.880 --> 08:32.200
took advantage of a symbol that was introduced in glibc 2.38, and that symbol obviously

08:32.200 --> 08:35.640
isn't present on jammy, and so the package won't work.
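
NOTE
Editor's sketch, not part of the talk: a binary needs at least the newest
glibc symbol version it references. The list of symbol versions below is
hypothetical (on a real system it could be read with a tool such as objdump
or readelf), and both helper functions are illustrative names.
```python
# Work out the minimum glibc a binary needs from its versioned symbols.
def parse_glibc_version(symbol_version: str) -> tuple[int, int]:
    # "GLIBC_2.38" becomes (2, 38)
    major, minor = symbol_version.removeprefix("GLIBC_").split(".")[:2]
    return int(major), int(minor)
def minimum_glibc(required: list[str]) -> tuple[int, int]:
    return max(parse_glibc_version(v) for v in required)
# Hypothetical symbol versions referenced by an extension module built on noble.
required_by_wheel = ["GLIBC_2.17", "GLIBC_2.34", "GLIBC_2.38"]
needed = minimum_glibc(required_by_wheel)
# glibc versions of the two distributions, as stated in the talk.
for distro, glibc in {"jammy": (2, 35), "noble": (2, 39)}.items():
    status = "ok" if glibc >= needed else "missing symbols"
    print(f"{distro}: glibc {glibc[0]}.{glibc[1]} {status}")
```
Under these assumptions jammy is reported as missing symbols and noble as ok,
matching what the demo shows at import time.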

08:35.640 --> 08:40.640
So our goal of trying to compile on the latest distribution to get the latest version of GCC

08:40.640 --> 08:47.400
sort of failed, but ultimately that's really what you want to do, but at the same time as

08:47.400 --> 08:52.720
we've just seen, you also need to compile on an old distribution with an old glibc, so the

08:52.720 --> 08:56.800
wheel that you create will work on a wide range of distributions, and that's a sort

08:56.800 --> 09:01.440
of a contradiction that we shall return to later on to see how it's solved upstream.

09:01.440 --> 09:05.920
And for now what I'm going to do is I'm just going to build SentencePiece directly on jammy

09:05.920 --> 09:09.680
and try and install it on noble, just to prove that that is the problem.

09:09.680 --> 09:14.400
So again, this is going to take a few seconds to build, and while it's building, I am

09:14.400 --> 09:16.960
going to set up my virtual environment.

09:16.960 --> 09:21.960
I'll just build it wrong.

09:21.960 --> 09:39.360
Okay, it is built, let me delete the old wheel that's here, and copy in our new wheel.

09:39.360 --> 09:47.400
Let's try and install it.

09:47.400 --> 09:59.120
And it works, so we built on jammy, we can take that wheel and install it on noble,

09:59.120 --> 10:00.120
that's good.

10:00.120 --> 10:04.520
So the second thing you need to understand is when building wheels, you need to try and build

10:04.520 --> 10:09.520
on as old a distribution as possible, against the oldest glibc possible.

10:09.520 --> 10:13.680
But SentencePiece is actually a pretty simple package. It doesn't

10:13.680 --> 10:16.600
contain a huge amount of code, and it doesn't have any dependencies.

10:16.600 --> 10:19.360
So let's try building something a bit more complicated.

10:19.360 --> 10:22.480
We're going to try and build NumPy.

10:22.480 --> 10:25.360
Here's the script I'm using to build NumPy, this is pretty much how NumPy is built

10:25.360 --> 10:26.360
upstream.

10:26.360 --> 10:32.160
Again, I'm building it inside a virtual environment with all the dependencies that are required.

10:32.160 --> 10:38.360
Okay, yeah, this is actually going to take about 50 seconds on this VM.

10:38.360 --> 10:43.800
So while it's building, allow me to just demonstrate that SentencePiece actually does work,

10:43.800 --> 10:55.480
I have a small test here for SentencePiece, so let's just run that.

10:55.480 --> 10:59.680
This is just tokenizing a sentence using SentencePiece, it says hello FOSDEM, and so

10:59.680 --> 11:03.640
we've tokenized this, and you can see that it is working fine.

11:03.640 --> 11:06.080
So this should be ready in a minute.

11:06.080 --> 11:09.760
We are building on jammy, because we've learned our lesson that we want to build using

11:09.760 --> 11:17.800
an old version of Ubuntu, and we're going to try and install it on noble when it's finished.

11:17.800 --> 11:24.040
Okay, it's finished, so let's copy the package that we built into our folder, and let's

11:24.040 --> 11:32.320
try pip installing it.

11:32.320 --> 11:35.760
It has installed, but does it work?

11:35.920 --> 11:38.920
Anybody guess?

11:38.920 --> 11:43.080
No, it doesn't work, and we get another huge page of error message.

11:43.080 --> 11:46.640
If you read through the error message, you eventually will come to this line here where it says

11:46.640 --> 11:51.480
the original error was, and it's complaining about a missing shared library, and so the problem

11:51.480 --> 11:57.480
here is that NumPy, like a lot of other Python packages, has dependencies on shared libraries

11:57.480 --> 12:02.920
in addition to glibc, and that shared library, OpenBLAS, is not installed locally

12:02.920 --> 12:08.960
on this noble VM, so when I try to use NumPy, try to import NumPy, it can't find the library

12:08.960 --> 12:10.760
and it fails.

12:10.760 --> 12:16.880
Now what you could do is you could just try installing OpenBLAS on this noble VM, and

12:16.880 --> 12:21.960
then re-importing NumPy, and it will probably work, but it might not work exactly correctly.

12:21.960 --> 12:28.840
And the reason for this is when upstream packages like NumPy use dependencies, they often

12:28.880 --> 12:33.760
rely on a very specific version of those dependencies, and they often expect them to be built

12:33.760 --> 12:36.560
in a very specific way, and sometimes to be patched.

12:36.560 --> 12:40.600
So if you just use the distro version of NumPy, you're not necessarily going to get the same

12:40.600 --> 12:44.800
behavior as people who download NumPy from PyPI.

12:44.800 --> 12:55.240
So the way that this is achieved upstream is that when NumPy builds their packages, they use

12:55.280 --> 13:02.240
a tool called auditwheel, and what auditwheel does is that it analyzes all of the binary

13:02.240 --> 13:07.920
components of a wheel, and it finds all of their dependencies, and it checks those dependencies

13:07.920 --> 13:11.880
to see if they're on a whitelist, and if they're not on the whitelist, it copies those dependencies

13:11.880 --> 13:16.160
directly into the wheel, and so when the wheel is distributed on PyPI, the wheel doesn't

13:16.160 --> 13:21.560
just contain NumPy, but it also contains all of NumPy's binary dependencies, so things

13:21.560 --> 13:25.560
like OpenBLAS and libgfortran are packaged inside the wheel.
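
NOTE
Editor's sketch, not part of the talk: the whitelist check described above
amounts to partitioning a binary's shared-library dependencies. This is not
auditwheel's real code. ALLOWED is only an illustrative subset of a manylinux
policy's library list, and the dependency list is hypothetical.
```python
# Decide which shared libraries stay on the system and which get vendored.
ALLOWED = {"libc.so.6", "libm.so.6", "libpthread.so.0", "libdl.so.2", "libgcc_s.so.1"}
def split_dependencies(needed: list[str]) -> tuple[list[str], list[str]]:
    """Return (libraries left to the system, libraries to copy into the wheel)."""
    system = [lib for lib in needed if lib in ALLOWED]
    vendored = [lib for lib in needed if lib not in ALLOWED]
    return system, vendored
# Hypothetical dependencies of a NumPy extension module.
needed = ["libc.so.6", "libm.so.6", "libopenblas.so.0", "libgfortran.so.5"]
system, vendored = split_dependencies(needed)
print("left to the system:", system)
print("copied into the wheel:", vendored)
```
With these inputs the OpenBLAS and libgfortran libraries end up in the list
to copy into the wheel, while libc and libm are left to the system.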

13:25.560 --> 13:31.560
And so we can just look at that working here, so what I'm going to do is I'm going to use

13:31.560 --> 13:39.080
auditwheel to repair my NumPy wheel with this little script here, and it's written

13:39.080 --> 13:42.360
the wheel to a different directory, so let's have a look.

13:42.360 --> 13:47.560
You can see also that the name of the wheel has changed. Previously, the tag here was linux,

13:47.560 --> 13:50.480
and now it says manylinux_2_35.

13:50.480 --> 13:55.040
And what that means, this manylinux_2_35 is a sort of a guarantee that this particular

13:55.040 --> 14:00.320
wheel will work on any Linux distribution that has glibc 2.35 or greater, and the way

14:00.320 --> 14:09.320
that's achieved is that it's linked against glibc 2.35, any dependencies that it uses

14:09.320 --> 14:14.000
that aren't on this whitelist are copied into the wheel itself, and for any dependencies

14:14.000 --> 14:18.320
that are on the whitelist, it makes sure that the package doesn't use any symbols

14:18.320 --> 14:24.240
in those dependencies that aren't present in a Linux distribution that ships glibc 2.35.
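
NOTE
Editor's sketch, not part of the talk: the promise made by a manylinux_X_Y
tag can be expressed as a version comparison. manylinux_compatible is a
hypothetical helper, the tags and glibc versions passed in are examples, and
pip's real check also covers the Python and ABI tags.
```python
# Check a manylinux platform tag against a machine's architecture and glibc.
import re
def manylinux_compatible(platform_tag: str, arch: str, glibc: tuple[int, int]) -> bool:
    match = re.fullmatch(r"manylinux_(\d+)_(\d+)_(\w+)", platform_tag)
    if match is None:
        return False
    major, minor, tag_arch = int(match[1]), int(match[2]), match[3]
    # The wheel works when the architecture matches and glibc is new enough.
    return tag_arch == arch and glibc >= (major, minor)
print(manylinux_compatible("manylinux_2_35_aarch64", "aarch64", (2, 39)))
print(manylinux_compatible("manylinux_2_35_aarch64", "aarch64", (2, 34)))
```
The first call models noble (glibc 2.39) and is accepted; the second models
a hypothetical distribution with glibc 2.34 and is rejected.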

14:24.240 --> 14:38.000
So let us copy our new wheel to our shared folder, and yeah, oh, let's uninstall the

14:38.000 --> 15:03.720
version of NumPy that's installed, install the new version, and it's working, import numpy,

15:03.720 --> 15:09.280
and it works, okay.

15:09.280 --> 15:13.680
So that is the third thing you have to consider, when you're distributing wheels, you

15:13.680 --> 15:19.160
need to repair them to make sure that they contain all the dependencies packaged directly

15:19.160 --> 15:23.880
within the wheel to guarantee that they're going to work on your end user's machines.

15:23.880 --> 15:28.200
Okay, so that's the end of the demo, wasn't too bad.

15:28.200 --> 15:33.440
Let's return to the presentation, can I do that?

15:33.440 --> 15:35.600
And we'll just recap what we just learned.

15:35.600 --> 15:39.560
So when you're distributing binary wheels in Python, you need to build multiple wheels.

15:39.560 --> 15:43.400
You need to build one wheel for each version of Python your customers are going to use.

15:43.400 --> 15:48.240
But actually, if you want to support multiple versions of glibc and

15:48.240 --> 15:52.880
multiple types of C standard library, say you want to support glibc and musl Linux, then for

15:52.880 --> 15:57.440
musl you need to build another set of wheels. If you want to support macOS and Windows,

15:57.440 --> 15:59.040
you need to build another set of wheels.

15:59.120 --> 16:04.640
If you want to support x86, AArch64, RISC-V, PowerPC, you need to build another set of wheels.

16:04.640 --> 16:09.360
And if you multiply all these together, you can end up having to build like 30 or 40 or 50 wheels.
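
NOTE
Editor's sketch, not part of the talk: a small, made-up build matrix shows
how quickly the number of wheels multiplies. The lists below are illustrative
assumptions and are not the actual NumPy build matrix.
```python
# Count the wheels in a hypothetical build matrix.
from itertools import product
pythons = ["cp310", "cp311", "cp312", "cp313"]
linux_libcs = ["manylinux", "musllinux"]
linux_arches = ["x86_64", "aarch64"]
other_platforms = ["macosx_x86_64", "macosx_arm64", "win_amd64"]
# One Linux wheel per Python version, C library and architecture.
linux = [f"{py}-{libc}_{arch}" for py, libc, arch in product(pythons, linux_libcs, linux_arches)]
# One wheel per Python version for each non-Linux platform.
others = [f"{py}-{plat}" for py, plat in product(pythons, other_platforms)]
print(len(linux) + len(others))
```
Even this small matrix gives 28 wheels, and every extra architecture or
interpreter adds another multiple.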

16:09.360 --> 16:15.840
I looked on the PyPI repository for NumPy, and they actually build 54 different wheels for

16:15.840 --> 16:17.440
a recent NumPy.

16:17.440 --> 16:20.280
And also, there's different Python interpreters, right?

16:20.280 --> 16:24.480
So there's PyPy, and that needs its own special wheel.

16:24.480 --> 16:27.920
So that's the first tip of the building Python packages iceberg.

16:27.920 --> 16:31.200
You need to build lots and lots of different wheels.

16:31.200 --> 16:35.680
The second issue that we have come across is ideally you want to build with a brand new glibc.

16:35.680 --> 16:41.200
Sorry, GCC, and you can imagine that's important for RISC-V, where we want to build with GCC 14,

16:41.200 --> 16:42.560
so we get vector support.

16:42.560 --> 16:47.360
But you also want to build with an old glibc, so you can widely distribute your wheels.

16:47.360 --> 16:50.480
And finally, we have this issue of dependencies.

16:50.480 --> 16:54.000
The issue of dependencies is actually a lot trickier than I've made it appear.

16:54.000 --> 16:57.520
It's not just a matter of running auditwheel, and the reason it's not just a matter of running

16:57.520 --> 17:02.000
auditwheel is when you're distributing your package, you have to find out what

17:02.000 --> 17:06.080
version of the dependency that package wants, how it wants it built, how it wants it patched,

17:07.120 --> 17:10.160
and then you have to build it, and then you can repair your wheel.

17:10.160 --> 17:14.720
And often building the dependencies for a binary Python package is much more difficult

17:14.720 --> 17:16.080
than building the package itself.

17:16.160 --> 17:24.160
OK, so this is all pretty tricky, and this is tricky on all platforms and all architectures.

17:24.160 --> 17:30.400
And so the sort of open source community has come up with a lot of projects to make this

17:30.400 --> 17:35.680
easier, to make the sort of the creation and distribution of Python packages easier.

17:35.680 --> 17:38.880
And so we talked a little bit about distribution before, this is a central repository

17:39.680 --> 17:41.040
for distributing Python packages.

17:41.040 --> 17:41.920
This is called PyPI.

17:42.400 --> 17:48.000
And when you just do pip install numpy, that's where pip goes by default to find

17:48.000 --> 17:49.040
its packages.

17:49.040 --> 17:56.720
This problem of wanting to build with a new GCC and an old glibc is solved by a project called

17:56.720 --> 18:01.360
manylinux. So manylinux is a project that publishes container images which are

18:01.360 --> 18:04.720
specifically designed to build Python packages.

18:04.720 --> 18:11.120
They ship with an oldish glibc, so the newest manylinux container image is manylinux_2_34,

18:11.200 --> 18:16.800
which ships with glibc 2.34, but it takes advantage of a Red Hat project called GCC Toolset,

18:16.800 --> 18:21.360
which allows you to install the latest version of GCC without updating your glibc.

18:22.160 --> 18:28.960
And so manylinux_2_34 has glibc 2.34, but it also has GCC 14, which is a really nice combination.

18:28.960 --> 18:33.200
The manylinux images also contain the latest versions of the build tools,

18:33.200 --> 18:36.960
stuff like Git and CMake, so you don't have an old version of these tools, which is good,

18:37.520 --> 18:42.480
that might prevent you building things. And they also come with about 10 different pre-built versions

18:42.480 --> 18:46.080
of Python. And that's important because when you're building multiple wheels, you need separate

18:46.080 --> 18:49.280
versions of Python for each different wheel you want to build.

18:50.960 --> 18:55.280
We've talked a little bit about binary dependencies, and we also looked at auditwheel,

18:55.280 --> 18:59.280
so you need to sort of vendor these binary dependencies into your wheel, and that is done by

18:59.280 --> 19:03.840
auditwheel, and it can also be done by a tool called maturin, and maturin is a tool designed to build

19:03.920 --> 19:09.440
hybrid Python/Rust packages, and it has this repair facility.

19:10.880 --> 19:15.200
We also talked about how you might have to build about 50 of these different wheels,

19:15.200 --> 19:19.120
and for each of these wheels you have to build, you'll have to identify the correct Docker

19:19.120 --> 19:22.720
image, the manylinux image to build them in. You've actually got to build the thing,

19:22.720 --> 19:26.240
you've got to repair the wheel, and then you've got to run all your tests, and you've got to do it

19:26.240 --> 19:30.720
50 times. So to prevent people from having to do this in every single upstream project, there's a

19:30.800 --> 19:34.080
program called cibuildwheel, which automates this process for you.

19:36.160 --> 19:41.120
Once the wheel is built, you're going to need to test it in your CI, so you're going to need,

19:41.120 --> 19:44.960
and this is, yeah, you're not going to want to do this locally, you're going to want to do this in CI,

19:44.960 --> 19:50.640
so you're going to need some runners to do this. You're going to run a lot of tests to make

19:50.640 --> 19:56.560
sure your wheel works, and then finally once it's built and tested, you can upload it to the

19:56.560 --> 20:00.320
package registry, and there's a bunch of tools you can use to upload and install here.

20:00.320 --> 20:04.880
So twine and maturin can be used to upload packages to registries like PyPI, and pip

20:04.880 --> 20:12.640
and uv can be used to install packages. Okay, so to summarize, building Python packages

20:12.640 --> 20:21.520
is difficult, but on the more popular architectures like Arm and x86, there's a whole pile of

20:21.520 --> 20:26.560
packages, or a whole pile of projects, that aim to make the problem easier.

20:27.360 --> 20:32.240
But if you look at riscv64, you will see that most of these packages, or most of these projects,

20:32.240 --> 20:40.240
don't actually support riscv64. The only ones that do are auditwheel and maturin, and the upload

20:40.240 --> 20:49.680
and install tools, so twine, maturin, pip and uv all support riscv64. So the problem is, on riscv64,

20:49.680 --> 20:53.840
not only do we have this really difficult problem to solve, but a lot of the tools that are

20:53.920 --> 21:01.280
normally used to solve these problems don't currently support riscv64. Okay, but don't get too

21:01.280 --> 21:08.160
disheartened. If we were to have looked at this slide this time last year,

21:08.160 --> 21:13.280
everything would be red in the right hand column. So progress has been made in the last year

21:13.280 --> 21:18.560
in getting riscv64 support into some of these projects, but it's still not there, and so for the

21:18.560 --> 21:23.840
time being the problem still exists on riscv64, and it's a big problem, because it's difficult

21:23.840 --> 21:30.880
to install these very useful Python packages. So recognizing this, RISE, which is an industry

21:30.880 --> 21:37.200
consortium designed to improve the RISC-V ecosystem, has created an open source project

21:37.200 --> 21:45.120
called Wheel Builder. And the Wheel Builder project, well, what it does is it builds

21:45.280 --> 21:52.720
and distributes a small set of binary Python packages that we think will be useful.

21:54.080 --> 22:00.080
We're currently building about 30 packages, including things like NumPy, SciPy, pandas, Matplotlib,

22:00.080 --> 22:05.120
maturin, and we're planning to build more packages in the future. Not only do we build them,

22:05.120 --> 22:10.320
but we also make sure that they're kept up to date. So, you know, we don't just build like NumPy,

22:11.280 --> 22:16.000
1.26.3, and then leave it there for two years. You know, we've been building newer versions

22:16.000 --> 22:25.040
of NumPy as well. All the things we're building are manylinux_2_35 riscv64 wheels, and they're

22:25.040 --> 22:31.200
built for RV64GC, so they should run on pretty much any device. And we go to great lengths to

22:31.200 --> 22:37.280
build these wheels in the same way that they're built upstream. So the behavior of the wheels we

22:37.280 --> 22:41.840
build in Wheel Builder should match the behavior of the wheels you download for other architectures

22:41.840 --> 22:47.920
from PyPI. So, for example, if we go back to the NumPy dependency problem, when we build and package

22:47.920 --> 22:53.360
OpenBLAS inside our NumPy wheels, we make sure we're building and packaging the exact same

22:53.360 --> 22:59.520
version of OpenBLAS that upstream NumPy does. So, you should see the same behavior. And the wheels

22:59.680 --> 23:06.320
are tested, mostly, I say here, because sometimes it's not possible to run all the tests.

23:06.320 --> 23:11.040
It can be because there's a RISC-V-specific bug, but generally it's because the test will

23:11.040 --> 23:16.560
require a dependency that doesn't exist on riscv64. And so in those cases, we disable the tests.

23:16.560 --> 23:21.840
But in most cases, the wheels are tested in the CI before they're uploaded, as well as we

23:21.840 --> 23:26.560
can test them, by running the project's normal tests. So, when you download wheels from

23:26.640 --> 23:33.120
Wheel Builder, the wheels should work. So, you're wondering, how do I download the wheels from

23:33.120 --> 23:39.120
Wheel Builder? Well, it's pretty simple. The way this actually works is that the Wheel Builder

23:39.120 --> 23:44.320
project is in GitLab and GitLab actually allows projects to create their own Python package

23:44.320 --> 23:49.120
registries. So, we build the wheels in GitLab and we upload them to the package registry associated

23:49.120 --> 23:55.840
with that project. It has quite a catchy URL here. But to install the packages, all you need to

23:56.400 --> 24:04.880
do is upgrade pip. You need pip 24.1 or greater for riscv64 support. So, if you have an earlier

24:04.880 --> 24:10.080
version, you'll have to upgrade pip first. Then you need to tell pip to use the GitLab registry

24:10.080 --> 24:14.240
instead of PyPI. And you do that by setting this environment variable, or you can actually

24:14.240 --> 24:19.920
specify this on the command line using the --index-url option. And then you just install the

24:19.920 --> 24:24.240
package. And so, I've got a little video here just to demonstrate this working.

24:25.840 --> 24:37.600
So, as you can see, I'm back on a RISC-V VM. I only have pip 24.0 here. So, I'm going to upgrade pip

24:37.600 --> 24:41.440
first. Otherwise, I will get an error when I try and install packages from Wheel Builder.

24:41.520 --> 24:56.400
So, the first thing we're doing, we're updating pip. Now we're going to install, now we're

24:56.400 --> 25:06.160
going to export our index URL, so pip knows where to find the packages. We're going to

25:06.160 --> 25:10.400
try and install SentencePiece. Now, remember, this failed spectacularly right at the start of the

25:10.400 --> 25:16.160
talk. But now we've specified the right index URL. It downloads the package and installs it,

25:16.160 --> 25:21.920
and it installs it really quickly. Let's try and install NumPy. Just to give you another example.

25:22.960 --> 25:26.640
And NumPy is a bit of a bigger package. It's got all those dependencies. So, it takes a little longer

25:26.720 --> 25:41.920
to install, but it should be installed pretty soon. Wow, that's slow. Yes, it's installed.

25:41.920 --> 25:45.520
And now let's just start Python and import NumPy just to demonstrate it works.

25:45.840 --> 25:57.200
And indeed, it does. Now, there's just one more thing that I want to show you. I'm going to

25:57.200 --> 26:01.040
try and install another package. This time, I'm going to install something called twine. We mentioned

26:01.040 --> 26:09.760
twine earlier on. Twine is a package for uploading tenants. Oh, it's a package for uploading

26:09.760 --> 26:16.640
other wheels to a package registry. So, the reason twine is interesting is because it's a

26:16.640 --> 26:22.320
pure Python package, but it has dependencies that are binary dependencies. And so, I'm able to

26:22.320 --> 26:29.680
install twine here, even though I've told pip to go and look at the wheel builder GitLab

26:29.680 --> 26:36.880
package registry, and not to go to PyPI. And we don't have twine in the wheel builder project,

26:36.880 --> 26:41.760
because it's a pure Python package. There's no point in us building it. And yet, this still works.

26:41.760 --> 26:46.880
And the reason it works is that when you send a request to GitLab, to the GitLab

26:46.880 --> 26:51.360
package registry, it will first check to see if it knows anything about that package.

26:51.360 --> 26:55.520
And if it doesn't have any wheels for that package, and it doesn't know anything about that package,

26:55.520 --> 27:01.120
it forwards the request to PyPI itself. And so, this allows us to just use one index

27:01.120 --> 27:09.280
URL to install a mixture of pure Python and binary Python packages, which is quite nice.

27:10.320 --> 27:16.960
So, that is pretty much it for the talk. What I might just do is pop up the wheel builder project,

27:16.960 --> 27:23.760
so you guys can see it. So, it's here on GitLab. I guess I have the link in the talk at the end,

27:23.760 --> 27:29.520
so let me just show you quickly the project. Here's the list of the packages we're building,

27:29.600 --> 27:33.040
and you can see we're building multiple versions of some packages, so the CMake,

27:35.040 --> 27:43.440
MarkupSafe, Matplotlib, OB tree, lots of different packages, pandas, tornado, TL parts,

27:43.440 --> 27:49.360
and you can just download those using that link and install them directly. And let me see, yes.

27:51.440 --> 27:53.680
Yep, and that is the end of the talk.

27:59.920 --> 28:04.960
Right, we have a couple of minutes for questions. Any takers?

28:15.840 --> 28:21.840
Thank you, great project. What would it take to adapt the wheel builder for other architectures,

28:21.840 --> 28:26.000
like LoongArch? Sorry, could you repeat the question please?

28:26.000 --> 28:30.000
Like, what would it take to adapt the wheel builder for other architectures, like LoongArch?

28:30.800 --> 28:42.320
Oh, I see. Ah, yes, that's an excellent question, and you reminded me of one or two things I meant

28:44.080 --> 28:49.600
to say that I didn't. So, the way we're building these wheels: remember when I was talking earlier on about

28:50.320 --> 28:55.600
how all these infrastructure projects are needed to build wheels for Python and how a lot of those

28:55.600 --> 29:00.320
are missing for riscv64. And the two important ones were manylinux and cibuildwheel.

29:01.360 --> 29:04.640
Because they don't exist for riscv64, we have had to patch them ourselves.

29:05.440 --> 29:09.600
So we have created our own manylinux container based on Ubuntu 22.04

29:10.480 --> 29:14.560
and our own patched version of cibuildwheel. And we use those internally to build the wheels.

29:14.560 --> 29:19.040
So to do this for another architecture that is not supported by manylinux or cibuildwheel,

29:19.040 --> 29:22.720
you would need to do the same thing. And you would also need to make sure auditwheel

29:23.120 --> 29:28.480
supports LoongArch. And I'm not sure it does. I would have to check the policy files for auditwheel.

29:28.480 --> 29:34.240
And that is the tool you need to repair the wheels. So, if auditwheel doesn't do that,

29:34.240 --> 29:40.080
then you would need a patched version of auditwheel as well. So it's definitely doable,

29:40.080 --> 29:43.520
but you would need to patch these projects that we have had to patch ourselves.

29:48.880 --> 29:49.360
Any more?

29:53.200 --> 29:58.960
Is that going to mean to be many dials? Or let's say those are 13 per unit, or the

29:58.960 --> 30:03.440
twice sunshine, or something like this, which means that we are going to be many off-back edges.

30:04.400 --> 30:11.360
So that's a very good question. At the moment, we're building everything with RV64GC, okay.

30:14.240 --> 30:21.520
It is possible, it should be possible now to build NumPy. If we were to build it with GCC14,

30:21.520 --> 30:24.960
so we'd need to create a new manylinux container, a 2_39 container,

30:24.960 --> 30:30.800
we should be able to build a NumPy that has some vector support in it, right? And that vector

30:30.800 --> 30:35.440
support would come through OpenBLAS. And because OpenBLAS does runtime detection of what

30:35.440 --> 30:41.680
extensions are available, that will work on all platforms. So it will work on, you know,

30:41.680 --> 30:46.240
something that doesn't have vector and something that does. But more generally speaking, there is no

30:46.240 --> 30:50.320
real way in Python at the moment, and this isn't just a RISC-V problem. There's no way to

30:50.320 --> 30:57.680
tag your wheels with information about what extensions those wheels expect. And this is a problem

30:57.680 --> 31:04.160
across all the architectures, but it hasn't been noticed until very recently, because the x86 wheels

31:04.160 --> 31:11.040
have been built on distros that target x86-64-v1. But that has just changed recently with manylinux

31:11.120 --> 31:20.240
2.34, which is now targeting x86-64-v2. And so some wheels built with manylinux 2.34 have been

31:20.240 --> 31:26.720
crashing on really, really old machines, and there's no way really at the moment to label those

31:27.920 --> 31:33.600
wheels. And so, yes, that is sort of an open issue in the Python packaging

31:33.600 --> 31:37.840
ecosystem, and it doesn't just apply to RISC-V, it applies to all the architectures, but it's

31:37.920 --> 31:41.920
really good that it's just come up in x86 because they're going to have to try and solve it,

31:41.920 --> 31:48.240
and we can get in on the act. I should also mention, actually, sorry, there's one other thing I

31:48.240 --> 31:52.800
want to mention, we don't want to do this forever. So our goal is just to build these wheels until

31:52.800 --> 31:57.600
such a time as all those infrastructure projects start supporting RISC-V, and all the

31:57.600 --> 32:02.160
upstream projects start uploading RISC-V wheels to PyPI. And when that happens, we'll

32:02.240 --> 32:06.240
start to wind the project down. So it's not something we're going to do forever.

32:14.160 --> 32:19.760
All right, thank you, Mark.

