WEBVTT

00:00.000 --> 00:11.360
Yeah, hello, welcome to our talk on zero code distributed traces for any programming language

00:11.360 --> 00:14.360
with eBPF, so that's the topic.

00:14.360 --> 00:22.240
A few words about us, so I'm Fabian, I am engineering manager at Grafana, also active in open

00:22.240 --> 00:26.760
source, a contributor to Prometheus, and we have Rafael.

00:27.680 --> 00:28.680
Does that work?

00:28.680 --> 00:33.760
Yeah, so Rafael, I also work at Grafana, I'm a software engineer, also with a

00:33.760 --> 00:40.000
background in open source, mostly involved with the Qt project in the past, and yeah, here

00:40.000 --> 00:41.000
we are.

00:41.000 --> 00:45.360
All right, and the way we structured this talk is I will do more of the introduction

00:45.360 --> 00:50.520
over here and say like what problems we are trying to solve, and then Rafael is the actual

00:50.520 --> 00:55.520
engineer, you know, who'll talk more about how we solved it, and he will do the main part

00:55.520 --> 01:01.400
of the talk, I think, so let's see, but for introduction, the tool we are going to use

01:01.400 --> 01:07.880
is named Grafana Beyla, it's an eBPF-based auto-instrumentation tool, it's open source,

01:07.880 --> 01:14.440
it's on GitHub, it's currently under the Grafana GitHub organization, licensed under Apache

01:14.440 --> 01:22.720
2.0 license, however we have started a process to donate it to OpenTelemetry, it's currently

01:22.720 --> 01:28.160
being reviewed by the OpenTelemetry technical committee, and once this review process is

01:28.160 --> 01:34.000
done, our intention is to move it over to the OpenTelemetry project, within OpenTelemetry

01:34.000 --> 01:38.720
it will likely not keep the name Beyla, it will be named something like OpenTelemetry

01:38.720 --> 01:41.960
eBPF instrumentation or something like that, right?

01:41.960 --> 01:47.160
So that's the ticket, like if you want to follow the status of that.

01:48.080 --> 01:53.720
Yeah, and I guess what is Beyla, I mean, if you want to show a tool, like the best thing

01:53.720 --> 01:58.600
you can do is a demo, even if it's a backend tool, so there's probably not too much

01:58.600 --> 02:05.640
to demo, but what I did, I had a couple of Go services running in the local Kubernetes

02:05.640 --> 02:11.640
cluster on my laptop, installed Beyla as a DaemonSet, told it to instrument everything

02:11.640 --> 02:16.440
in my default namespace in Kubernetes, it found these Go services, so there's a couple

02:16.440 --> 02:21.000
of them, and what we see here is a pretty standard, you know, Grafana dashboard with

02:21.000 --> 02:26.600
the three most important metrics that you would like when you monitor HTTP services,

02:26.600 --> 02:35.480
request rates, error rates, and durations, how long things take, right, so this

02:35.480 --> 02:40.440
is all metrics, right? It has nothing to do with tracing, but where tracing comes in is,

02:40.440 --> 02:46.040
if you, for example, see that the error rate here of my product service is more than 20%,

02:46.440 --> 02:51.240
you might want to learn what's the reason for that, right, and in a distributed application,

02:51.240 --> 02:55.800
of course, the root cause might not be in the product service itself, but somewhere downstream,

02:55.800 --> 03:01.080
like whatever, it's calling, and so this is where tracing comes in, so you go to your trace

03:01.080 --> 03:09.400
database, run a query, for example, you would say I'm looking for any span with an HTTP response

03:09.400 --> 03:16.840
status code of 500, then you get examples of erroneous calls, right, and if you look at

03:16.840 --> 03:23.240
such a trace, so let me maybe close that window so that we have a little bit more space,

03:24.280 --> 03:29.800
so you see as indicated by this red symbol here that there is an error in the product service,

03:29.800 --> 03:34.280
but if you scroll down you see that this product service also calls the pricing service a couple

03:34.280 --> 03:39.720
of times, and there's also an error down here, right, and so if you open the details, you can

03:39.720 --> 03:46.760
you know, have a look and see, you know, that it's an HTTP request and which path is called,

03:47.560 --> 03:55.240
but now I cannot scroll down anymore, but you get the picture, right, and so the reason

03:55.240 --> 04:02.280
I am showing you this is, you know, to just let you know that this talk is about distributed tracing,

04:02.360 --> 04:05.800
and also the previous one, and there's a lot of talk about distributed tracing,

04:06.600 --> 04:11.320
metrics are still, I think, the core, like the most important signal, right, metrics are what you

04:11.320 --> 04:16.760
have on your dashboards, what you define your alerts with, what you use in your SLOs, and so forth,

04:16.760 --> 04:24.280
and metrics with eBPF work pretty great, so we have support for a lot of network protocols,

04:24.280 --> 04:31.000
HTTP, HTTPS, HTTP/2, and so forth, right, a huge list, we are adding more and more to it,

04:32.280 --> 04:38.440
it works for any programming language, so I listed a few here, but actually I think as of today,

04:38.440 --> 04:43.080
I'm not aware of any language where it doesn't work, so if you have any language where you

04:43.080 --> 04:48.520
wouldn't get proper metrics, open an issue, we will figure out why, but it's actually pretty

04:48.520 --> 04:53.400
stable, and of course there's not only network metrics, we have process metrics like whatever

04:53.400 --> 05:00.520
memory usage and that kind of stuff, right, so if you do tracing, I mean because as you

05:00.520 --> 05:04.920
have seen, if you actually want to get to the root cause of things, tracing is pretty useful,

05:05.640 --> 05:12.280
tracing with eBPF is harder than metrics with eBPF, right, so why is it hard? So basically,

05:12.920 --> 05:18.520
you need to do three things if you want to do tracing with eBPF, so we have a service here,

05:18.680 --> 05:25.160
it gets incoming HTTP requests, does something, and at some point performs an outgoing HTTP request,

05:25.880 --> 05:30.760
the first thing you need to do, when an HTTP request is coming in, is to figure out if

05:30.760 --> 05:36.040
it already is part of an existing trace, which you learn by looking at the HTTP headers,

05:36.040 --> 05:41.400
there's a traceparent header with the trace ID and the parent span ID, and this part is solved, right,

05:41.400 --> 05:47.160
we can look into headers of incoming requests and parse these and figure out the current span

05:47.560 --> 05:54.600
context. The second part, the service does something and you need to keep track, you know when the

05:54.600 --> 06:00.040
service does an outgoing request, which incoming request this belongs to,

06:00.600 --> 06:07.880
and, depending on the framework that you use, this is hard to do with eBPF, so we solved this

06:07.880 --> 06:13.160
for a couple of frameworks, first of all if you have just a single-threaded application, that's

06:13.160 --> 06:18.520
easy, right, if you have the standard Java server model with one thread per request, that's also

06:18.520 --> 06:24.120
easy, if you have multi-threading it gets harder, so we solved it for Go, so if you start a couple

06:24.120 --> 06:28.600
of goroutines and they are scheduled and so forth, we track these goroutines and can still know

06:28.600 --> 06:34.760
what the original context was, so multi-threading with Go works, we instrumented Node.js async

06:34.760 --> 06:40.840
operations and so forth, so there are a couple of frameworks where we reliably know which outgoing requests

06:40.920 --> 06:45.320
belong to which incoming requests, but we have to be honest, it doesn't work in all cases,

06:45.320 --> 06:53.320
so for example, if you have a fancy, reactive Java application with some kind of framework,

06:53.960 --> 06:58.840
there's a lot of scheduling back and forth on thread pools, we might lose track, so it basically

06:58.840 --> 07:04.200
depends a little bit on the framework that you use, whether we can track these requests or not, right,

07:04.920 --> 07:10.040
and then the last part is, if you figured out the trace context that belongs to this outgoing

07:10.120 --> 07:16.840
request, you need to add the trace ID and the parent span ID to the outgoing request, so that the

07:16.840 --> 07:23.080
next service downstream knows which trace this request belongs to, and this is also hard to do,

07:23.080 --> 07:28.360
but this is something that we've solved in the past couple of months, and this is what

07:28.360 --> 07:33.960
this talk is about, right, and now I'm handing over to Rafael, who will walk you through all the details,

07:33.960 --> 07:37.880
like what we did, what problems we ran into and so forth, so have fun,

07:38.680 --> 07:46.040
all right, so as Fabian just explained to you, we need a few things, one is to propagate

07:46.040 --> 07:51.400
the context, so this is the sequence diagram of the demo that Fabian just showed you, and we have

07:51.400 --> 07:59.080
all these requests going on, they are denoted with dashed arrows, and so what needs to go around is

07:59.800 --> 08:06.440
basically two things, the span ID and the trace ID, as you know, the span ID identifies each individual

08:06.520 --> 08:11.000
request, and the trace ID is used to tie them together, so we have one ID that

08:11.720 --> 08:16.440
chains everything together, so this is what it looks like, we have a

08:19.000 --> 08:25.320
header, an HTTP header that gets injected, it's defined by the W3C, it's called traceparent,

08:26.840 --> 08:32.120
so it's made of the version info, the trace ID, the span ID and the flags, and then we end up

08:33.080 --> 08:42.600
with this big string there, that gets injected into the HTTP header, and how is that done,

08:42.600 --> 08:48.280
like how do we pass that around, with auto-instrumentation or not, like how can we inject that?

08:49.000 --> 08:56.440
Well, the obvious way is to use an SDK, right, there's some protocol there, you know, some API,

08:56.440 --> 09:01.240
and you do it, but, you know, we promised this is zero-code instrumentation,

09:01.240 --> 09:04.840
for any programming language, so that's not what we're going to be talking about,

09:07.160 --> 09:13.640
and that's where Beyla comes into play with eBPF, just being curious here, who's familiar with

09:13.640 --> 09:19.000
eBPF and knows what it is? Quite a few, okay, that's great, for those who don't know,

09:19.240 --> 09:27.960
in very, very constrained terms, it's a Linux kernel infrastructure that lets us run

09:27.960 --> 09:33.880
small programs, they are very safe, they go through a verifier, and these programs,

09:35.240 --> 09:40.680
they give us some kind of visibility inside the kernel, and we can attach them to different

09:40.680 --> 09:47.560
parts of the kernel, there are different kinds of eBPF programs that fill different roles, and we use that

09:47.560 --> 09:55.880
to correlate the requests, and also to inject the traceparent. So, basically, we have two simple

09:56.760 --> 10:02.120
requirements to do that, one is parsing the incoming request, as I said, and the other is writing memory,

10:02.120 --> 10:07.480
so writing the traceparent header, at the eBPF level we're basically writing process memory,

10:08.040 --> 10:16.040
and that's, yeah, that's where the problem begins. So, it's kind of difficult to write process

10:16.040 --> 10:23.240
memory with eBPF, so, I mean, this is just verbose code, it's an example of how we do it with

10:23.240 --> 10:29.240
Go, you don't have to really understand it, the takeaway here is that in eBPF programs

10:30.120 --> 10:35.880
you cannot use libc or anything like that, you can only use a set of predefined helpers,

10:35.880 --> 10:42.760
one of them is called bpf_probe_write_user, and in this case, we're using this to inject the trace

10:42.840 --> 10:54.600
parent string for a Go application, but this helper is a bad idea. It was added back in 2016,

10:55.160 --> 11:03.400
to the Linux kernel, but the main goal was just to help people messing with eBPF to run experiments

11:03.640 --> 11:12.440
or, you know, debug things, and because you're writing user space memory, you can crash your program

11:12.440 --> 11:19.960
or do bad things, and also you need CAP_SYS_ADMIN, which means you need root,

11:21.160 --> 11:28.200
so, you know, security implications, and yeah, you will see this helper in a few

11:28.360 --> 11:33.800
auto-instrumentation projects like Beyla or otel-go, but mostly in malicious tools,

11:33.800 --> 11:41.640
if you go on GitHub, so because of that, it was locked down in August 2021 under the Linux

11:41.640 --> 11:48.520
security lockdown mode, meaning that for production, it's really difficult to use it, I mean, it's been

11:48.520 --> 11:56.120
used, it works, but, you know, it has a lot of downsides, so we started looking for alternatives,

11:56.120 --> 12:06.280
like, what can we do? Well, that bpf_probe_write_user helper works in one kind of eBPF program

12:06.280 --> 12:12.840
called probes, they get attached, it's like a breakpoint, to make a very bad analogy,

12:15.240 --> 12:20.040
that gets attached to a function in your program or a kernel function. There is a different

12:20.040 --> 12:29.480
kind of program called a classifier program, a BPF program type, that lets us see network packets,

12:29.480 --> 12:34.280
so we're no longer dealing with functions or kernel functions, but it

12:34.280 --> 12:40.040
sits really low level on the network stack, close to the, you know, the data path, close to the

12:40.040 --> 12:46.520
network card or the, you know, lowest level, and we can analyze the contents of packets, and

12:47.000 --> 12:51.800
these kinds of programs let us change the contents of the packets, so write packet memory,

12:51.800 --> 12:56.280
so not program memory, nor user space memory, but just messing with the packet, so

12:57.720 --> 13:01.960
so the idea then, what we're trying to do, is we see all these packets going through

13:01.960 --> 13:08.760
on the egress path, so the outgoing packets, those are the outgoing HTTP requests that

13:08.760 --> 13:14.440
Fabian was talking about, we see, okay, do they have an HTTP header there, are they an HTTP

13:14.600 --> 13:21.640
request, cool, then we need to punch a hole in that HTTP header and then write the trace

13:21.640 --> 13:27.720
parent string, that big string, too easy, right, but as you know, famous last words,

13:27.720 --> 13:35.160
because it's never that easy, so our first attempt was to do something like this, so here's like

13:36.120 --> 13:43.240
a TCP packet, it has all these headers, the Ethernet, IP, TCP, and then in green there,

13:43.240 --> 13:49.880
you have the HTTP header, there is a helper that can be used with these programs called bpf_skb_change_head,

13:49.880 --> 13:57.640
so the idea was to extend the head of the packet, and then we push, we copy, we find where the

13:57.640 --> 14:03.080
HTTP header starts, we push it up, making room for our traceparent, and then we write it,

14:03.080 --> 14:09.080
cool, not really, because it turns out the kernel keeps track of the original offsets, of

14:09.080 --> 14:16.120
where these headers are, the Ethernet, IP and whatnot, so when we adjust it, the kernel also adjusts

14:16.120 --> 14:21.960
the offsets, and when we manually copy and push it up, we mess with that, what happens then is that

14:21.960 --> 14:27.720
when we try to send out the packet, it gets dropped, so it doesn't work, so we need to do something

14:27.720 --> 14:34.120
else, and what we do instead is we change the tail of the packet, so it's a different helper,

14:35.480 --> 14:40.440
it has a bit of a downside that we need to copy all the payload of that packet down, and we inject the

14:40.440 --> 14:52.760
traceparent, all right, so we did that, the packet goes out, so just a digression here,

14:53.640 --> 15:01.720
so the way TCP works, in a really small nutshell, is that to make sure that nothing gets

15:01.720 --> 15:07.080
lost on the way, we have something called the TCP sequence number, so basically you send a TCP

15:07.080 --> 15:14.920
packet out, and then your peer replies, okay, I've received these many bytes, this is the sequence

15:14.920 --> 15:20.280
number, so say in my traffic I've already sent out, let's say, 100 bytes, then I have the

15:20.280 --> 15:25.720
original packet here, that's 50 bytes, I'm just making these numbers up, and I send it out,

15:25.720 --> 15:32.040
in total I have already transmitted 150 bytes, and then I get an ACK packet saying, okay,

15:32.040 --> 15:40.280
ACK 150, when we do this, this happens at the very end of the network stack,

15:40.280 --> 15:45.240
we extend the tail, the kernel doesn't know about that, we're just messing with it, so now we're

15:45.320 --> 15:51.400
no longer, let's say 50 bytes, we have 60 bytes, we send it out, and then in total we have

15:51.400 --> 15:59.800
already transmitted, instead of 150, we transmitted 160 bytes, and we get an ACK 160, and then

15:59.800 --> 16:05.320
the packet traverses up the network stack, but the kernel says, oh, but I was waiting for 150,

16:05.400 --> 16:17.240
now this is bad, drop it, so what do we do? Well, crazy idea, I mean, the first attempt was an

16:17.240 --> 16:24.600
experiment we tried to, okay, kind of fudge with the kernel, so we keep track of how many bytes

16:24.600 --> 16:32.840
we have extended throughout the TCP connection, and then when we get an ACK back from our peer,

16:32.840 --> 16:38.920
TCP peer, we adjust that, we subtract those bytes again, and we fool the kernel, that works,

16:39.640 --> 16:47.960
but you know it's not very robust, you need to have Beyla running there, keeping tabs

16:47.960 --> 16:53.720
on what's going on, and if Beyla goes away, then you've messed up all your connections, and you know,

16:54.520 --> 17:00.920
it's not great, so we figured out a more robust approach, so I was telling you about

17:01.880 --> 17:07.800
how eBPF has different kinds of programs, so far we've been using one whose program type is

17:07.800 --> 17:13.400
SCHED_CLS, or the classifier program, but there's a different kind of eBPF program called socket

17:13.960 --> 17:24.200
message, so that's the first one there, SK_MSG, so this program sits very high up on the

17:24.280 --> 17:32.280
data path of the packet on the network stack, we cannot write packet data, so we cannot

17:32.280 --> 17:39.400
mess with the contents of the packet, but we can add space in the packet, so it lets us use a

17:40.600 --> 17:49.480
different kind of helper, called bpf_msg_push_data, that lets us expand the packet at an arbitrary

17:49.480 --> 17:55.560
offset, so eBPF is all about which program type you're running and the set of helpers this program

17:55.560 --> 18:03.880
is allowed to use, so we do a two-tier approach, right, we use the socket message

18:05.320 --> 18:10.600
program to punch the hole we want, and then the kernel is aware of that, because once the packet

18:10.600 --> 18:15.560
reaches the part of the kernel where it starts dealing with the offsets of the headers I talked

18:15.640 --> 18:23.640
about before, it accounts for that, and then once we receive the packet down the data path in the original

18:23.640 --> 18:30.280
classifier program, all it has to do is write the traceparent string, so that works pretty nicely,

18:30.280 --> 18:39.080
but obviously it has some downsides here, first of all we're looking at strings, writing data,

18:39.080 --> 18:44.760
it's all plain text, we need to be able to parse that, so it doesn't work on encrypted traffic,

18:45.800 --> 18:53.080
there might be a way out in the long term with kTLS, so that's TLS support in the kernel,

18:53.080 --> 18:58.920
then we could attach eBPF programs at strategic points there where we could still see the data

18:58.920 --> 19:05.160
before it gets encrypted, but it's not the case yet, and we don't have HTTP/2 support in

19:05.160 --> 19:12.680
Beyla yet, we're working on that, it's a bit more complicated, so we have an alternative,

19:12.680 --> 19:19.160
so they are complementary, we want to be robust, we want to be able to propagate the trace context,

19:19.160 --> 19:26.840
and we encode the traceparent in the IP header, so that's the other idea, so the IP header,

19:27.000 --> 19:39.080
it has a few fields we can use, so on the right-hand side I have an IPv4 packet, or its layout,

19:39.080 --> 19:47.000
and this is IPv6, so IPv4 has a section called options, and IPv6 has also a section called options,

19:47.000 --> 19:56.280
but it works a bit differently because it's just a bunch of chained extension headers,

19:56.840 --> 20:05.000
so what we do here, so I'm going to show you IPv4, but the same logic applies for IPv6,

20:05.000 --> 20:10.840
so here's a simplified version of the same packet, we have the Ethernet header, IPv4,

20:10.840 --> 20:18.040
TCP, and payload, and then the classifier program can also use a different helper called

20:18.200 --> 20:28.600
bpf_skb_adjust_room, this helper is specifically made for operating on these headers like IPv4 or Ethernet,

20:28.600 --> 20:38.280
and it's only meant for that, you cannot adjust room in the payload or TCP, so we basically insert

20:39.240 --> 20:49.160
space for a new IP option, so right after the IPv4 header, the option part of the packet goes

20:49.160 --> 20:55.960
right in sequence, so we insert that option, and then we encode the traceparent there, but

20:56.920 --> 21:02.200
we don't have a lot of space there, so we cannot encode everything,

21:03.160 --> 21:09.240
we cannot encode the big string or anything like that, so we have to be a bit clever about that,

21:09.240 --> 21:14.520
so we don't care about the version info, we do not encode the span ID, we do not encode the flags,

21:15.320 --> 21:23.880
we need only 20 bytes, so basically the way it's encoded in IPv4 is we write the option ID,

21:23.880 --> 21:30.760
so we use an option called stream ID, it's rarely used, so hopefully it doesn't conflict with anything,

21:32.200 --> 21:36.600
and then we encode the length, so we need 16 bytes, that's the trace ID that we want to

21:36.600 --> 21:41.160
propagate, and then some padding, because it needs to be a multiple of four, but then you might be

21:41.160 --> 21:46.840
wondering okay, what about the span ID, right, we're not passing it along, well we don't have to,

21:46.840 --> 21:55.800
because it's just a unique and random number, so now we need two kinds of programs,

21:55.800 --> 22:00.520
I was only talking about egress, injecting things, but now we need an ingress program, so on the

22:00.520 --> 22:07.320
other end of the data path, when we're receiving the packets, we need to be able to see if there is

22:07.320 --> 22:15.960
one of those options encoded in my packet, and be able to parse it, and then on ingress,

22:16.600 --> 22:22.440
we generate the span ID at that stage, and we derive it from fields of the TCP packet,

22:22.440 --> 22:26.520
the sequence number and the ACK number, the infamous number that gave us the problems before,

22:26.520 --> 22:33.000
so four bytes each, together eight bytes, and we have our span ID, and Beyla remembers that, and

22:35.320 --> 22:45.240
it keeps tabs on things, so there are obviously caveats with that, we require eBPF agents, or Beyla,

22:45.240 --> 22:52.440
to run on both sides now, whereas when injecting the traceparent in the HTTP header, it plays

22:52.440 --> 22:56.760
nicely with auto-instrumentation and manual instrumentation, because they can just parse it,

22:59.720 --> 23:07.240
and maybe, depending on your network topology, the option can be stripped by

23:08.600 --> 23:12.920
network layer features, it doesn't work with Layer 7 boxes, because of that as well,

23:12.920 --> 23:18.120
unless they're instrumented, because then we can re-inject it, so you need to test it, it usually works,

23:18.120 --> 23:24.840
but you know, famous last words as I say, we are looking into TCP, encoding these things into TCP options,

23:24.840 --> 23:31.880
in the TCP header, we had a few setbacks in the past, but we figured out a different way, we're working

23:31.880 --> 23:36.440
on that to see if it works, because that would solve this, TCP options usually don't get stripped,

23:38.040 --> 23:45.400
and this is, well, a summary comparing both approaches, so the original HTTP header approach,

23:45.480 --> 23:51.880
it's a multi-tier approach, two kinds of eBPF programs, it plays nicely with standard OpenTelemetry

23:51.880 --> 23:58.600
instrumentation, because you can inject your traceparent manually and Beyla will know about it,

23:58.600 --> 24:02.360
or Beyla can inject it itself, or an SDK, and we know about it, so it's like

24:02.360 --> 24:08.360
pretty much transparent, not suitable for HTTPS traffic, and it's still lacking HTTP/2 support,

24:09.320 --> 24:15.320
the IP header approach is not compatible with standard OpenTelemetry instrumentation, but, you know,

24:15.320 --> 24:20.040
it can work with encrypted traffic, and it requires eBPF instrumentation for all services involved,

24:21.960 --> 24:29.400
and finally, I mean, we just had a Beyla release that brings all of these goodies

24:29.400 --> 24:32.600
there, but you know, plenty of work to do, so we encourage you to get involved,

24:32.920 --> 24:39.960
everything is open source, you can find it on GitHub, and also, we are on the Grafana

24:39.960 --> 24:45.320
public Slack in the Beyla channel, and yeah, I guess, that's pretty much it.

24:57.320 --> 24:58.040
Any questions?

25:03.560 --> 25:08.920
I'll just keep quiet, so we can hear the questions.

25:26.840 --> 25:32.120
Yeah, I'm really interested in, like, this is, I didn't hear you, okay, can you hear me now?

25:32.920 --> 25:33.320
Yes.

25:35.400 --> 25:42.280
Okay, this was awesome, thank you very much, and I'm already using this, I really appreciate it.

25:43.640 --> 25:50.760
I am interested, particularly in things that can auto, or can already be instrumented,

25:50.760 --> 25:56.520
auto-instrumented using the traditional stuff, and obviously, in the Beyla docs,

25:56.520 --> 26:02.520
you guys talk about instrumenting, for example, Java applications, is there something that I

26:03.480 --> 26:09.160
am missing, that there's an advantage to using eBPF in those situations, or is it more just

26:09.160 --> 26:13.960
in cases where you can't use that auto-instrumentation, because you can't get the agent in.

26:20.200 --> 26:28.520
Great question, so the short answer is, for Java, it makes more sense to use the Java

26:28.520 --> 26:33.960
instrumentation agent by the OpenTelemetry project, because that's a feature provided

26:33.960 --> 26:38.600
by the JVM where you can hook in, get a lot more details, a lot more internals, and so forth.

26:38.600 --> 26:44.920
So if you have Java, and you have the chance to attach the OpenTelemetry Java agent to your

26:44.920 --> 26:51.080
JVM, then that will give you more insights than Beyla instrumentation. There are a few cases where

26:51.080 --> 26:56.120
you cannot do this. The obvious one is modern Java, where you can compile to native executables,

26:56.120 --> 27:00.680
so if you do that, then there's no JVM, and then you are lucky that you have Beyla as an

27:00.680 --> 27:07.080
alternative. The other thing is a little bit more subtle. Usually, you cannot have more than one agent.

27:07.080 --> 27:12.040
I mean, you can technically, but agents prevent that, because they conflict. So if you

27:12.040 --> 27:18.840
already have, whatever, some proprietary agent attached to your JVM, and it refuses to

27:18.840 --> 27:25.640
let the other agent attach as well, so there might be situations where eBPF is the only way out.

27:25.640 --> 27:31.880
But if you're able to just attach the OpenTelemetry Java agent, for the specific case of Java,

27:31.880 --> 27:33.800
I think it's the best option.

27:34.120 --> 27:40.440
That's really great. Thank you so much.

27:48.440 --> 27:49.000
One, three.

27:53.560 --> 28:00.600
Okay. First of all, it's a great presentation. Thanks. I have a question about what if I don't

28:00.600 --> 28:07.880
want to propagate the spans forward, for example [inaudible]. If you start to mess up

28:07.880 --> 28:15.560
your headers, you need to also try to adjust the signing and so on, the signatures of the headers

28:15.560 --> 28:24.120
and so on. For example, I want to measure how long it took, end to end.

28:24.120 --> 28:30.840
So from request to response. So can it be used just for creating that mapping

28:30.840 --> 28:38.920
between request and response, without instrumentation, and just put the spans out to some system to

28:38.920 --> 28:39.560
track those.

28:41.320 --> 28:44.680
Sorry, can you repeat the question? It's really hard to hear from here.

28:46.520 --> 28:53.560
Okay. Is it possible to map the request and response without propagating the

28:54.440 --> 29:01.480
context inside. So for example, I want to just track the amount of

29:01.480 --> 29:08.120
time which I spent in the HTTP call. And for sure, I don't want to mess with all the

29:08.120 --> 29:15.000
headers, because they're signed, because there is a signature. But I want to use the zero

29:15.000 --> 29:22.840
instrumentation for tracking from request to response time, just to know how much it took, and I would

29:22.920 --> 29:26.600
put it to some database to map it.

29:27.400 --> 29:32.120
If I understand your question correctly, you're asking if you can map the request and response

29:32.120 --> 29:40.280
without propagating context. Is that right? Yes, with a few caveats. I would say

29:40.280 --> 29:46.840
only on the same node, because Beyla, the agent, is what keeps the context. So it

29:47.320 --> 29:53.960
correlates the request and response. It has the data. But if you are shipping a request to a

29:53.960 --> 29:59.960
different pod or different node or whatever, that has a different agent running there,

30:00.920 --> 30:06.200
then you would need to somehow transmit that information. That's when this comes into play.

30:06.200 --> 30:07.640
If that answers your question.

30:09.480 --> 30:16.440
Yeah, and by the way, we can also map the data from different nodes. So we have, for example,

30:16.440 --> 30:26.760
the mapping by IP headers, like ports and all that stuff. So if you have data from different nodes,

30:27.560 --> 30:31.400
yeah, then you need the context propagation. Otherwise, you need to, as far as I know,

30:31.400 --> 30:34.600
you need that. Yeah, next.

