WEBVTT

00:00.000 --> 00:17.520
So, hi everybody, welcome to this last session of this conference basically for us here

00:17.520 --> 00:24.080
in this room. I'm Christian Iwainsky. I'm a research software specialist at

00:24.080 --> 00:30.160
TU Darmstadt and an HPC specialist, and today I'm presenting work that is a

00:30.160 --> 00:36.560
collaborative effort of many. In particular, this talk was submitted by people from the Friedrich-

00:36.560 --> 00:42.400
Alexander University and TU Darmstadt. So, that's me. And what am I going to talk about today?

00:42.400 --> 00:47.920
Well, first I'm going to introduce basically the basic question of why is

00:48.560 --> 00:53.680
job-specific monitoring something interesting? Then I'm going to expand on the tools that

00:53.680 --> 01:01.200
were developed by the project partners, ClusterCockpit and PathoJobs. And I'm going

01:01.200 --> 01:09.360
to talk a little bit about the user experience of these systems. So, who am I? I'm from TU Darmstadt.

01:09.360 --> 01:15.520
TU Darmstadt is one of the nine NHR centers in Germany, which are not the top tier,

01:15.520 --> 01:22.160
but also not the medium and lower tiers. It's like big HPC systems. In the order of magnitude,

01:22.160 --> 01:27.840
each system has a thousand nodes. Each of the nodes has a hundred cores. So, we're talking about

01:27.840 --> 01:34.960
a hundred thousand cores in total. Each core on average has four gigabytes of memory. And we have about

01:34.960 --> 01:43.520
500 to a thousand active users, give or take. Now, classical monitoring by

01:44.160 --> 01:50.640
admins is on the system level. We see what each node is doing, whether or not the node is healthy.

01:51.920 --> 02:00.000
We see whether or not the hardware's operational. We see whether we're getting the performance

02:00.000 --> 02:05.920
that we paid for, but we also would like to see what are the users doing. Are the users actually

02:05.920 --> 02:13.200
making any meaningful use of the resources they occupy? Are they occupying all of the resources

02:13.200 --> 02:18.640
or are they just allocating the resources and leaving them empty? We would like to know:

02:20.400 --> 02:25.280
what is the limiting factor of those jobs that are running in our system? Is the memory

02:25.280 --> 02:29.360
bandwidth the limiting factor? Is the network card the limiting factor? So, we can do better

02:29.360 --> 02:35.040
for them in the future. And then, in the end, we would also like to know: how is the

02:35.040 --> 02:40.240
performance of those particular jobs doing? Are they running at peak efficiency? Are they doing

02:40.320 --> 02:45.040
the best they can? What is the software actually doing with the resources on the system?

02:45.040 --> 02:52.560
Like hardware performance counters and so forth. And why is the software behaving as it is, as we can

02:52.560 --> 03:00.560
observe it? Classically, what we have is system monitoring, where the

03:00.560 --> 03:05.920
admins have Grafana; they have big visualization boards where they can see all nodes are green,

03:05.920 --> 03:12.480
some nodes are yellow, some nodes are blue, what does that tell them? It tells them stuff about nodes.

03:12.480 --> 03:17.760
Well, what we could do, we could go in and say, okay, who was on that node? Hopefully, it was an

03:17.760 --> 03:22.960
exclusive job and nothing else was running. And then we can somewhat infer what that job was doing.

03:23.760 --> 03:27.920
That's somewhat in the direction of application monitoring. And then there's something in the

03:27.920 --> 03:33.840
bottom, what we HPC specialists do: performance analysis. We instrument, we profile, we modify

03:33.920 --> 03:38.960
the code of the software to get as much information out of it, so we can be enlightened about what

03:38.960 --> 03:46.880
is actually happening. But this middle region here is somewhat unsupported. Well, we can kind of

03:46.880 --> 03:53.360
infer stuff, we can mock up certain tools to bring us that information, but there wasn't anything

03:53.360 --> 03:59.040
on that particular level at the beginning when this, basically, when my colleagues and I started

03:59.040 --> 04:04.400
working on this problem. So what we came up with, mostly the people in Erlangen, I

04:04.400 --> 04:10.720
have to admit, came up with, is something called ClusterCockpit. The idea of ClusterCockpit is that

04:10.720 --> 04:17.040
we are not tracking the whole cluster; we are tracking at the granularity of jobs. What does this

04:17.040 --> 04:23.040
then look like? Okay, ClusterCockpit provides everyone who wants it, users, admin staff,

04:23.120 --> 04:28.800
supporters with a graphical user interface where they can see what is running on the cluster.

04:28.800 --> 04:34.960
So you can scroll down and for each job that is running on the cluster, we get one of those lines,

04:34.960 --> 04:41.200
each of those lines then consists of different configurable metrics. For example, the first one here

04:41.200 --> 04:49.360
is the CPU utilization, then we have memory here, then we have a FLOPS metric,

04:49.360 --> 04:55.600
so how many FLOPS are the nodes doing in total, and how much memory bandwidth is consumed on those

04:55.600 --> 05:01.760
particular systems. So when looking at that overview, we can get a brief glimpse of what is the

05:01.760 --> 05:07.520
job doing, is it using the resources, is there anything amiss? We get additional information here

05:07.520 --> 05:15.040
in front, including, not shown here, the username. So we can actually contact them

05:15.040 --> 05:20.320
in case we want to support them in some way or another. We can get some summary information:

05:20.320 --> 05:24.640
how many cores do they have, how many nodes do they have, when did it start, how long is the

05:24.640 --> 05:29.600
job running, is it still running, or is it a job from the past that I can potentially

05:29.600 --> 05:34.320
want to investigate, we can filter with different metrics like tags, how long did it run,

05:34.320 --> 05:40.080
so it is a very comfortable user interface. You can also click on any of those jobs and get even more

05:40.080 --> 05:45.200
detailed information like, for example, again, all of those metrics, you can have those metrics,

05:45.200 --> 05:49.440
you can zoom in on the metrics, you can select whether you want to have the metric shown

05:49.440 --> 05:54.560
at the node level, at the core level, at the socket level, whatever granularity it is, so

05:54.560 --> 06:02.240
you can interactively browse your cluster workload, not on a node level but on a job

06:02.240 --> 06:08.160
level. And in particular, something like these graphs here, maybe not that well visible on the

06:08.160 --> 06:12.640
projector, where you have this kind of spider diagram where you can see, okay, how much of the

06:12.640 --> 06:18.240
memory is used on the system, how many FLOPS the computation consumes, and how much

06:18.240 --> 06:23.520
memory bandwidth is consumed. Ideally, we would like to have a full triangle here, because then the job

06:23.520 --> 06:29.120
is churning out the maximum floating-point operations, using all of the bandwidth and using

06:29.120 --> 06:34.080
all of the memory, and, if you want, probably also all of the cores: a fully used system. Great.

06:35.040 --> 06:39.840
Now, that job isn't doing that. So we probably would like to investigate. And there are other

06:39.840 --> 06:49.040
metrics that are supported, like, for example, over here, this classical graph: the roofline graph

06:49.040 --> 06:54.960
that helps you assess whether or not your job is within boundaries. There are many other views,

06:54.960 --> 06:59.440
and I just want to go over them briefly, because I'm not going to train you how to use

06:59.440 --> 07:03.680
ClusterCockpit here; I just gave you a teaser of what the whole system is capable of doing.
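
As an aside, the roofline assessment that plot performs can be sketched in a few lines. The function and all numbers below are illustrative assumptions, not values or code from the talk:

```python
def roofline(peak_gflops, peak_bw_gbs, measured_flops, measured_bytes):
    """Classify a job as memory- or compute-bound on the roofline model.

    peak_gflops: peak floating-point rate of the (sub-)cluster
    peak_bw_gbs: peak memory bandwidth
    measured_flops / measured_bytes: totals observed for the job
    """
    intensity = measured_flops / measured_bytes          # FLOP per byte moved
    attainable = min(peak_gflops, peak_bw_gbs * intensity)
    bound = "compute" if attainable == peak_gflops else "memory"
    return attainable, bound
```

A job whose attainable performance is capped by bandwidth sits on the sloped part of the roof; one capped by the FLOP rate sits under the flat part.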

07:03.680 --> 07:08.160
You can also open up a view for one user and say, okay, how many jobs does that

07:08.160 --> 07:12.560
particular user have? What is the total walltime? And then you can also configure which of

07:12.560 --> 07:18.880
these resources you want to see: do you want to see how many FLOPS were used, whatever makes you happy.

07:21.920 --> 07:27.120
Going back briefly to the overview, we can also investigate the total use of a

07:27.120 --> 07:31.600
sub-cluster. This is something a little bit more in the direction of administrative stuff.

07:31.600 --> 07:36.640
How many nodes are actually allocated? What is the current flops rate in that cluster?

07:36.640 --> 07:40.080
What is the current memory bandwidth in the cluster? And this, again,

07:40.080 --> 07:45.680
can give us system maintainers, and maybe even procurement people, an idea of where we want to

07:45.680 --> 07:50.320
invest our money in the future because if we have a system and all our users are hardly making

07:50.320 --> 07:54.960
use of the floating-point performance and hardly using the memory bandwidth that the system is capable

07:55.040 --> 07:59.840
of delivering, why are we buying that expensive hardware? But again, this is an entirely different

07:59.840 --> 08:05.840
can of worms. And this is an example; this is not actual real data.

08:06.960 --> 08:13.680
Might be, might not be, I don't want to tell. Again, you can also have different views like, for example,

08:13.680 --> 08:20.480
who are the top users on certain clusters? What is the ratio? What is the average duration of the

08:20.480 --> 08:25.840
jobs or how many nodes does that particular job have? So it gives you different views of what

08:25.840 --> 08:34.080
the cluster is doing. Okay, how does it do that? Well, it all starts with an entity that's called

08:34.080 --> 08:38.960
a metric store, which is the short-term memory of the whole system. Usually you install that on one

08:38.960 --> 08:46.480
particular node, it needs a lot of memory because it keeps all of the data you obtain on the cluster

08:47.360 --> 08:54.080
for the system to display to you on short notice. Then, on each node, you install and start

08:54.080 --> 09:00.720
a metric collector. That's a little, I would call it, service daemon that runs at a configurable

09:00.720 --> 09:06.160
interval, for example every minute, every five minutes, every ten minutes, and it collects

09:06.160 --> 09:11.440
whatever metrics you want it to collect. In particular, we are using LIKWID to gather

09:11.440 --> 09:16.640
the hardware performance counters of the system. We use cpustat to get, okay, how much is,

09:16.640 --> 09:21.280
what is the load on the system? We use memstat to get information about memory. We use ibstat

09:21.280 --> 09:25.760
to get information from the InfiniBand card, but there are, I think, last time I counted,

09:25.760 --> 09:31.920
it was about 40 different interfaces that you could use to get information, even for GPUs,

09:31.920 --> 09:38.480
or how full is the GPU, what kind of GPU, tons of data. All of that is sent to the metric store.
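
As a rough sketch of one collector cycle (not the real cc-metric-collector; the message fields below are made up for illustration), using only the standard library:

```python
import json
import os
import socket
import time

def sample_node():
    """Gather a few node-level metrics, stdlib only (Unix)."""
    load1, load5, load15 = os.getloadavg()
    return {
        "host": socket.gethostname(),
        "timestamp": int(time.time()),
        "metrics": {"load_one": load1, "load_five": load5,
                    "load_fifteen": load15},
    }

def encode_sample(sample):
    """Serialize the sample as JSON, as the talk says the collectors do."""
    return json.dumps(sample)

# A real collector would loop: sleep for the configured interval, then
# send encode_sample(sample_node()) to the metric store.
```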

09:38.480 --> 09:44.160
So you have a many-to-one communication, and the metric store keeps

09:44.160 --> 09:50.480
all of that. That is JSON-based communication. Then you have the backend that is responsible for showing

09:50.480 --> 09:57.600
you the nice user interface, this web-based user interface, and that queries this metric store

09:57.600 --> 10:03.680
for whatever it needs to display the view. But since you're interactively working with the

10:03.680 --> 10:10.800
system, that needs to be quick, so it's not on disk storage. What happens then, once the job has

10:10.800 --> 10:16.320
passed a certain point in time, usually at the end when the job terminates, is that the backend

10:16.320 --> 10:21.680
system takes all of the data that pertains to a particular job and dumps all of that into a

10:21.680 --> 10:28.400
job archive. The job archive is then the permanent record of what the job was like at that

10:28.400 --> 10:33.360
time. The metrics are resolved in time, so we have time series of all of the data and you can

10:33.360 --> 10:40.160
get that. How do you get that information? Well, that is the one thing that isn't by default enabled

10:40.160 --> 10:45.840
in the distribution. You have to write a little script so that, whatever batch system you're using,

10:45.840 --> 10:50.560
at the start and at the end of the batch job, you need to communicate, hey, my job ran on node 1,

10:50.560 --> 10:56.960
2, and 3, and it had socket 1, 2, and 3, so all of the data needs to be communicated to the

10:56.960 --> 11:01.520
backend system, so the backend system knows what kind of data to pull and put in the job archive.
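
The message a prolog/epilog script would send could look roughly like this. The field names and the endpoint mentioned in the comment are illustrative assumptions, not the real cc-backend API:

```python
import json

def job_event(event, job_id, nodes, sockets=None):
    """Build the start/stop message a batch prolog or epilog would send
    to the monitoring backend, so it knows which data belongs to the job."""
    body = {"event": event, "jobId": job_id, "nodes": nodes}
    if sockets is not None:
        body["sockets"] = sockets  # resource binding, e.g. allocated sockets
    return json.dumps(body)

# In a Slurm epilog one might POST this with urllib.request to a
# hypothetical endpoint such as https://cc.example.org/api/jobs/stop
```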

11:02.480 --> 11:09.600
And in this case, there is a Slurm plugin, or not a plugin but a cron job, that checks

11:09.600 --> 11:14.560
every five minutes, okay, which jobs are there, which jobs aren't there anymore, and communicates

11:14.560 --> 11:20.400
all of that to the backend system. And then in the end, the user can log onto the backend system

11:20.400 --> 11:27.040
and whatever they want to see, they can see. Optionally, you can also couple all of that with

11:27.120 --> 11:33.680
databases, time-series databases, SQL databases, for whatever else you want to use it for.

11:35.920 --> 11:41.120
And this brings me now to the second point. Since this is now a very complicated system with

11:41.120 --> 11:46.560
lots of displays and lots of data, nobody actually wants to go through each job and assess whether

11:46.560 --> 11:52.880
or not that particular job was a good job. And for that reason, my colleagues and I

11:53.200 --> 11:59.280
said, okay, can we automate this kind of checking by having a rule system that notifies us when

11:59.280 --> 12:05.520
certain boundaries or certain thresholds are violated? Basically, some form of automated alert,

12:05.520 --> 12:10.480
okay, a user here or there is doing a very bad job of using the resources that you should be

12:10.480 --> 12:18.720
using efficiently. So the idea there was, okay, how can we fit that in? Well, we use this

12:18.720 --> 12:24.960
ClusterCockpit technology that we have. We wrote some form of rule mechanism that operates on each

12:24.960 --> 12:31.680
job archive and assesses whether that particular job is well-behaved or not,

12:31.680 --> 12:37.600
or whether there is maybe something wrong with it. And what do these rules look like? Well,

12:38.400 --> 12:45.520
it is a pragmatic system. It's not very sophisticated. It's basically Python expressions evaluated

12:45.600 --> 12:51.440
in JSON files. So what do we have here? Well, we have a tag or a description of what this is.

12:52.000 --> 12:59.520
In this case, this is a low CPU load. So when the CPU is not fully used, then that should

12:59.520 --> 13:05.120
alert us. The second part of the rule is, okay, we need some form of configurable threshold,

13:05.120 --> 13:08.960
because we don't want to adjust all of these rules for each cluster and tier that it might have.

13:08.960 --> 13:13.200
So this is from a configuration file. Here we say, okay, we want to have an alert. If this

13:13.200 --> 13:19.680
goes below 90%, for example, and then this one specifies, okay, which metrics from the

13:19.680 --> 13:25.360
job archive the rule engine needs to pull to actually evaluate the rule. And then we

13:25.360 --> 13:32.960
have variables that are evaluated top to bottom, basically defining certain thresholds,

13:32.960 --> 13:39.360
being able to use all of the Python math you want. And then, in the end,

13:39.360 --> 13:46.880
we have some form of output trigger. In this case, once the low-load condition is true or above a certain

13:46.880 --> 13:53.520
threshold, then we consider that rule to have failed and then we want to notify the user.
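
A minimal sketch of such a rule engine, in the spirit of "Python evaluated in JSON files". The rule schema, field names, and the evaluation function are illustrative assumptions, not the project's exact format:

```python
import json
import statistics

# A rule: a tag, the metrics to pull from the job archive, a configurable
# threshold, and a Python expression that decides whether the rule fires.
LOW_CPU_LOAD_RULE = json.loads("""
{
  "tag": "low_cpu_load",
  "metrics": ["cpu_load"],
  "threshold": 0.9,
  "expr": "statistics.mean(cpu_load) < threshold * cores"
}
""")

def rule_fails(rule, job):
    """Return True when the rule's condition triggers for the given job."""
    env = {
        "statistics": statistics,
        "threshold": rule["threshold"],
        "cores": job["cores"],
    }
    for name in rule["metrics"]:
        env[name] = job["metrics"][name]  # e.g. a time series of samples
    # eval() with stripped builtins, so the expression only sees the env
    return bool(eval(rule["expr"], {"__builtins__": {}}, env))
```

A job on 4 cores with an average load of 1.0 would trip this rule; one averaging near 4.0 would pass.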

13:53.520 --> 14:00.240
And how do we do that? Well, there is what we call an email template that is filled with the data

14:00.240 --> 14:06.480
that this rule generated. And then what do we do? Initially, we thought about sending

14:06.480 --> 14:11.680
emails to the user. And I'm going to talk about that in a minute. But what kind of rules do we have

14:11.680 --> 14:18.320
at the moment? Well, we have low CPU load, we have load imbalance, we have excessive CPU load,

14:18.320 --> 14:26.080
we have memory use problems. So a lot of different rules that help us kind of get a gist of what

14:26.080 --> 14:32.320
the users are doing wrong. A brief overview. So how did we fit that into the ClusterCockpit

14:32.400 --> 14:38.000
setup? Well, we simply wrote a little script so that whenever there's a job archive on disk,

14:38.000 --> 14:44.640
a new job archive, we analyze it and write the information about it to the output.
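
Such a "little script" could look roughly like this. The file layout, the tag field, and the rule_fails() hook are illustrative assumptions about how it might be structured:

```python
import json
import pathlib

def tag_new_archives(archive_dir, rule_fails, rules, seen):
    """Scan the job archive for entries not yet analyzed and tag them.

    archive_dir: directory holding one JSON file per finished job
    rule_fails:  callable (rule, job) -> bool, e.g. a rule-engine hook
    rules:       list of rule dicts, each with at least a "tag" key
    seen:        set of file names already processed on earlier passes
    """
    for path in sorted(pathlib.Path(archive_dir).glob("*.json")):
        if path.name in seen:
            continue  # already analyzed in an earlier pass
        job = json.loads(path.read_text())
        # record every failed rule as a tag the GUI could later display
        job["tags"] = [r["tag"] for r in rules if rule_fails(r, job)]
        path.write_text(json.dumps(job))
        seen.add(path.name)
```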

14:46.240 --> 14:53.360
We deployed that in Darmstadt, in Erlangen, in Paderborn, and in Berlin. So there are four sites using

14:53.360 --> 14:58.480
this kind of mechanism at the moment, and we are still fine-tuning it so it suits us best.

14:59.040 --> 15:06.480
But the key insight came once we saw how many jobs actually had problematic behavior.

15:06.480 --> 15:16.720
It's like 50%, 20%, 30%, we kind of thought, no, we are not going to send emails because

15:16.720 --> 15:23.680
if you have a job load like what we have is like we have 250,000 jobs per month,

15:24.640 --> 15:31.520
sending 125,000 emails per month would make us email spammers, and we didn't want to do that.

15:31.520 --> 15:36.160
So we never really implemented sending the emails. What we do at the moment is,

15:36.160 --> 15:42.240
we have text files this is dumped to, which we specialists then analyze, and we tag the jobs in

15:42.240 --> 15:48.480
the job archives so that the GUI once it gets to the point can actually display for each job

15:48.480 --> 15:55.280
also those kind of failed metrics or failed rules that we have specified so that when

15:55.280 --> 16:00.000
a user looks at his own jobs, he can get an assessment. Okay, I'm doing a great job because

16:00.000 --> 16:04.480
nothing has failed, or I'm doing a bad job of using the resources when there are

16:04.480 --> 16:12.880
lots of red flags at the top of his screen. Briefly commenting on the experience of how

16:13.600 --> 16:18.640
the system was perceived: since I wasn't initially part of the ClusterCockpit development team,

16:18.640 --> 16:25.840
I had the great honor of deploying the system at our university, and in principle it is very easy

16:25.840 --> 16:31.600
to deploy. They have a good web page, and there are binaries on the web page

16:31.600 --> 16:37.840
that you can basically drop in. If you trust the binaries, you can start them and

16:37.840 --> 16:44.080
everything is really easy. However, it gets a little bit tricky once you get to the point where

16:44.080 --> 16:48.880
the system needs to know what kind of cluster you're running on. It doesn't magically infer what is

16:48.880 --> 16:55.360
the peak floating point capability of a particular CPU. It doesn't infer what kind of memory is there.
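
The kind of hand-written description you end up supplying might look roughly like this. All names and numbers below are made up for illustration, not a real cc-backend configuration:

```python
# A cluster split into phases (sub-clusters), each with its own peak
# values against which the metric views and rules would be scaled.
CLUSTER = {
    "name": "example-cluster",
    "subclusters": [
        {"name": "phase1", "nodes": 500,
         "peak_flops_gflops": 3000, "peak_membw_gbs": 200},
        {"name": "phase2", "nodes": 500,
         "peak_flops_gflops": 5000, "peak_membw_gbs": 350},
    ],
}

def peaks_for(subcluster):
    """Look up the peak values a metric view would be scaled against."""
    for sc in CLUSTER["subclusters"]:
        if sc["name"] == subcluster:
            return sc["peak_flops_gflops"], sc["peak_membw_gbs"]
    raise KeyError(subcluster)
```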

16:55.360 --> 17:01.200
There are tools in there that benchmark all of this stuff and put that into configuration

17:01.200 --> 17:08.000
files, but you still need to bundle it together in terms of: okay, I have a cluster, and that

17:08.000 --> 17:13.840
cluster consists of two or three phases; each phase has a different configuration, so you need to create

17:13.840 --> 17:18.880
a hierarchical structure, and for each of those you need to specify the peak performance

17:18.880 --> 17:25.760
it has in all of the metrics you want to have evaluated. And then it gets a little bit more

17:25.760 --> 17:32.240
complicated, because all three instances need to have matching configurations. So, for

17:32.240 --> 17:37.280
example, if the metric collector collects stuff that the metric store doesn't know about, the metric

17:37.280 --> 17:42.640
store just rejects the information and drops it, so it doesn't retain it. So the metric store needs

17:42.640 --> 17:47.680
to know which metrics to keep, ideally the ones it gets from the nodes. And then the

17:47.680 --> 17:52.960
ClusterCockpit backend also needs to know which metrics it can pull from the metric store. So at the moment

17:53.040 --> 17:58.080
there is no automatic way to couple all of these things together; it's multiple configuration files,

17:58.080 --> 18:03.120
and you need to have a little bit of discipline not to mess up the names and whatnot. But

18:04.560 --> 18:12.960
once it gets started, it is fairly easy to use. So, in summary, a bit of a challenge we are currently

18:12.960 --> 18:18.800
aiming to address hopefully is that all of these entities shown here have their own individual

18:18.880 --> 18:24.880
configuration files, and those configuration files need to match up. It's not comfortable at the

18:24.880 --> 18:33.520
moment, but it's doable. And as a last takeaway, since we have a little bit of time to spend at the end,

18:34.240 --> 18:41.200
I would consider ClusterCockpit and PathoJobs very powerful tools to have on your cluster,

18:41.200 --> 18:48.240
because, on the one hand, it enables users to inspect and reflect on their

18:48.240 --> 18:54.160
jobs on a per-job basis without having to do anything. They have the information: they can log onto

18:54.160 --> 18:59.600
the system and see their jobs; they can click on the jobs; they can see what the consumption

18:59.600 --> 19:03.920
of certain resources was; they can see whether or not they're using the memory; they can check

19:03.920 --> 19:09.440
which cores they ran on, whether they left some cores out, whether the core mapping was according to plan.

19:09.440 --> 19:15.440
So this kind of information is available for the users. For me as an HPC support person aiming to

19:15.440 --> 19:22.320
improve the capabilities of our users, I can have a look at every user's jobs, once configured with

19:22.320 --> 19:28.480
the right privileges, and then, for example, say: okay, this user needs some, let's say, counseling in

19:28.480 --> 19:33.440
terms of how to allocate resources and how to make sure the resources are actually

19:33.440 --> 19:39.120
used. And, for example, if I get tickets like, hey, my job ran very slowly, can you tell me what's going

19:39.120 --> 19:44.880
on? I can open ClusterCockpit, go back in time, look at this job and see, for example: oh, that one

19:44.880 --> 19:52.000
particular node had a cooling problem and the CPU clocked down, because, again, I'm also tracking

19:52.000 --> 19:58.880
the clock rates that the CPUs do over time, so I can match that with bug reports or tickets. So

19:58.880 --> 20:04.080
that helps me support users. And the admin staff get a lot more information about what

20:04.080 --> 20:09.360
is actually used on the system, what is required of the system, and what the users actually make use of.

20:10.000 --> 20:17.520
And that's the end of this presentation. I'm happy to answer any questions you may have

20:17.520 --> 20:24.400
and share my own user experience of the system. Thank you.

20:28.000 --> 20:35.440
Time for questions. Yes? Right, how does it scale up? How much data are you taking in?

20:36.400 --> 20:42.800
So, regarding the question: each section would get its individual metric store, and again, you could put

20:42.800 --> 20:48.400
some form of hierarchy in there, but that is a manual process at the moment, and in particular you

20:48.400 --> 20:56.960
need to be able to tell at the backend level: okay, if I'm on that node, I need to contact that

20:56.960 --> 21:05.440
metric store. And at the moment this is done by ranges: so, ideally, node 0 to 99 is metric store one,

21:05.840 --> 21:13.120
100 to 199 metric store number two. You need to have something like that, but since this is open source,

21:13.120 --> 21:20.320
you could go in and implement your own mapping strategy, at your own risk.
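
The range-based routing described here can be sketched in a few lines. The ranges and store names are illustrative, not the project's actual configuration:

```python
# Each metric store is responsible for a contiguous range of node indices;
# the backend routes a query for a node to the matching store.
STORE_RANGES = [
    (0, 99, "metric-store-1"),
    (100, 199, "metric-store-2"),
]

def store_for_node(node_index):
    """Return the metric store responsible for the given node index."""
    for first, last, store in STORE_RANGES:
        if first <= node_index <= last:
            return store
    raise ValueError(f"no metric store configured for node {node_index}")
```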

21:26.960 --> 21:42.880
So that was also one of the questions I asked those guys, and there are two answers. So,

21:42.880 --> 21:47.280
to repeat: the question is, why didn't we implement that using

21:47.280 --> 21:52.560
existing technology like Prometheus and Grafana? So, number one, at least that's what I'm told:

21:52.560 --> 21:59.040
Grafana has the problem of how you handle user privileges, how you enforce: okay, this user is allowed to see

21:59.040 --> 22:15.040
only that fraction of jobs; that's what I'm being told, again. So the counter-

22:15.040 --> 22:20.640
answer is that you can put infrastructure in Prometheus and Grafana to mitigate that,

22:20.800 --> 22:26.880
and since Jan Eitzinger, who did the initial design of ClusterCockpit, is not here, I cannot answer that.

22:26.880 --> 22:37.600
However... so, let me rephrase the question, hope I get it right: did we validate our results

22:37.600 --> 22:42.960
with information from profiling and tracing runs? At the moment we have a bachelor thesis running

22:42.960 --> 22:49.200
which is doing precisely that particular validation part. So the idea is that

22:49.200 --> 22:54.640
the student goes in and submits jobs using classic performance analysis techniques like instrumentation,

22:54.640 --> 23:00.640
or even benchmarks where we know, from the validation phase of the cluster, what the benchmarks

23:00.640 --> 23:05.680
produce in terms of data, and then we go back and see whether or not that information is

23:05.680 --> 23:12.880
actually depicted in ClusterCockpit itself. However, my intuition is, since both of them are

23:12.960 --> 23:19.120
using the same gathering techniques, like hardware performance counters, it should be the same, unless

23:19.120 --> 23:24.560
there is a software bug somewhere in the system where we make gross mistakes and just use

23:24.560 --> 23:32.160
wrong data fields or whatever. But again, there's a thesis going on to validate that. And, to be frank,

23:32.160 --> 23:37.760
since that system is at the moment in production at four HPC centers and we haven't seen anything

23:37.840 --> 23:47.120
overly strange, I would say we can state with some confidence that it is working

23:47.120 --> 23:53.760
within a reasonable range. I can't say to what degree it is precise, because, again, we are not doing

23:53.760 --> 23:58.800
anything in particular to make it very precise. But since we're sampling at a minute level,

24:00.160 --> 24:06.240
you don't see that much anyway. You can see a lot, but the fine detail that you see in, let's say,

24:06.320 --> 24:12.080
a good performance instrumentation or trace, with, let's say, MPI calls in there and whatnot, that's not the

24:12.080 --> 24:18.240
level of granularity that the system is aiming for.

24:18.240 --> 24:36.080
Right, thank you, Christian. So the question was: is the system more popular

24:36.080 --> 24:48.080
with the consultants or with the users? I have a problem answering that, because on the one

24:48.800 --> 24:54.960
hand the consultants love it, because they can see what the users are doing. The users are very appreciative

24:54.960 --> 25:00.320
when you get in contact with them and are actually able to show them what their jobs were doing. It's like,

25:01.600 --> 25:06.640
it's no longer "just believe me, I've seen this stuff" or "I have a screenshot here"; it's

25:06.640 --> 25:12.320
like: here, log in, use your credentials, this is your job, you can browse it, you can analyze it, you can

25:12.320 --> 25:18.400
submit another job and compare. That is a very powerful aspect: you can actually

25:18.400 --> 25:23.440
do side-by-side comparisons of your jobs as a user without having to know how to instrument,

25:23.440 --> 25:27.840
where to place the data, or how to use a data exploration system like Vampir.

25:28.960 --> 25:35.920
It makes things easy, and for that reason I would say it's equally good for both sides.


