WEBVTT

00:00.000 --> 00:11.480
Right, hello, so I'm Robin, this is Timo, and we have Half-Shot joining us virtually,

00:11.480 --> 00:17.040
we're all from Element's VoIP team, and we're here to tell you all a bit about Matrix RTC

00:17.040 --> 00:22.160
and how to build real-time applications on top of Matrix using this.

00:22.160 --> 00:29.720
So you all probably know what Matrix is, but it's worth restating that today Matrix's

00:29.720 --> 00:35.200
primary use case is instant messaging, but Matrix really has ambitions to cover all kinds

00:35.200 --> 00:41.160
of exotic communications use cases, starting with voice over IP, but then extending possibly

00:41.160 --> 00:47.040
to virtual reality, augmented reality, the Internet of Things.

00:47.040 --> 00:53.400
So let's take a moment to talk about what Matrix is good at today and what building

00:53.400 --> 00:55.320
blocks it provides.

00:55.320 --> 01:02.360
So today we have primitives that are good at enabling these near-real-time use cases like instant

01:02.360 --> 01:03.840
messaging.

01:03.840 --> 01:09.200
And so some of the primary things that it gives us are when you're in a room, you have

01:09.200 --> 01:14.560
a room timeline which represents the sequence of events that have happened there and

01:14.560 --> 01:21.240
due to decentralization, this isn't just a strict order, but you actually get an acyclic

01:21.240 --> 01:28.520
graph and so you have to think about what happens when the order of events isn't clear.

01:28.520 --> 01:36.960
But if you want to use this for really streaming data quickly in real-time, well you can

01:36.960 --> 01:43.320
get sub-second latencies perhaps if you're exchanging data locally just on one homeserver

01:43.320 --> 01:49.000
or in some cases over federation, but often over federation you're going to start seeing

01:49.000 --> 01:57.640
latencies that go beyond the one-second margin and especially if you want to be like doing

01:57.640 --> 02:02.840
things like streaming voice and video, putting this into a room timeline event really

02:02.840 --> 02:08.720
doesn't make sense because the room timeline is for things that are meant to persist indefinitely.

02:08.720 --> 02:15.240
And the other important thing that Matrix gives us is room state, which is a summary of events

02:15.240 --> 02:20.040
that have happened in the room as basically a key value store.

02:20.040 --> 02:25.200
And so these are great for building instant messaging, but what's missing is if you want

02:25.200 --> 02:30.440
to do things like voice or whiteboards, we're missing a way to actually communicate

02:30.440 --> 02:36.640
in real-time with low latency, low jitter and exchange data that might be more ephemeral

02:36.640 --> 02:40.280
in nature.

02:40.280 --> 02:45.680
And right, so let's take a brief moment to go over the ways that we've tried to do voice

02:45.680 --> 02:52.760
over IP in Matrix in the past for the voice and video call use case.

02:52.760 --> 02:58.840
So ever since, there's been the idea that we could just use Matrix as a signaling layer

02:58.840 --> 03:07.120
for WebRTC and establish peer-to-peer WebRTC connections for doing these one-on-one calls.

03:07.120 --> 03:13.880
And what you do is you just take SDP offers and you'd exchange them in the room timeline.

03:13.880 --> 03:20.040
And this is great because it automatically takes advantage of Matrix's encryption primitives.

03:20.040 --> 03:24.640
And so you pretty easily get an encrypted call working that way.

03:24.640 --> 03:28.320
But this just doesn't, it's not a group call.

03:28.320 --> 03:33.560
And so when we thought about how we are going to make group calls work in Matrix, the

03:33.560 --> 03:37.560
first idea was maybe we can just use some off-the-shelf solution that already scales really

03:37.560 --> 03:42.880
well to hundreds of participants, such as FreeSWITCH or Jitsi.

03:42.880 --> 03:50.360
And so Jitsi in particular was integrated as an iframe, as a widget that would just be rendered

03:50.360 --> 03:55.040
within your Matrix client.

03:55.040 --> 04:00.120
And then we thought about, well, what if we want to make this a more native experience,

04:00.120 --> 04:06.160
actually integrate it in with Matrix and try to bring back some of leveraging

04:06.160 --> 04:08.760
Matrix's encryption.

04:08.760 --> 04:13.520
So that was why when we were working on the first iterations of Matrix RTC, we thought

04:13.520 --> 04:19.520
about maybe you can just take this simple one-on-one WebRTC setup and actually just generalize

04:19.520 --> 04:22.680
that to connect to more peers.

04:22.680 --> 04:28.640
And so you could have group calls where you're just routing your media to your peers

04:28.640 --> 04:32.560
in a full mesh configuration, every peer connects to every other peer.

04:32.560 --> 04:38.560
And so this works to get group calls and they're encrypted, you're communicating, you're

04:38.560 --> 04:44.720
just communicating your SDP over to-device messages that are in private channels and they're

04:44.720 --> 04:46.800
encrypted with Olm.

04:46.800 --> 04:51.400
But it doesn't scale of course because most internet connections that users are on, they're

04:51.400 --> 04:59.480
just not going to have the upload bandwidth to be able to send your local video 10, much

04:59.480 --> 05:02.880
less, 100 times over to all the other participants.

05:02.880 --> 05:09.840
So what we really needed is something that was as ready to ship as an off-the-shelf solution

05:09.840 --> 05:15.400
like Jitsi and that could scale as well as having a conference server that can sit in

05:15.400 --> 05:18.960
the data center and have a better internet connection.

05:18.960 --> 05:23.080
But also a solution that we can still tightly integrate with Matrix so that we can check

05:23.080 --> 05:25.040
this encryption box as well.

05:25.040 --> 05:32.160
So we were really looking for an SDK and we kind of found exactly what we were needing in

05:32.160 --> 05:33.480
the form of LiveKit.

05:33.480 --> 05:39.480
So LiveKit is a company that's building open source solutions based on WebRTC for enabling

05:39.480 --> 05:42.760
all kinds of real-time applications.

05:42.760 --> 05:48.360
They develop an SFU, a selective forwarding unit, based on Pion, which is an open

05:48.360 --> 05:54.120
source WebRTC framework written in Go, that was exactly what we were considering using

05:54.120 --> 06:02.600
when we thought briefly about writing our own SFU that would be more native to Matrix.

06:02.600 --> 06:07.440
So it's like, if we don't want to reinvent the wheel, we should just use LiveKit.

06:07.440 --> 06:13.320
And so LiveKit provides the concept of rooms and participants, and for Matrix RTC, we

06:13.360 --> 06:18.000
want to have the concept of sessions that you can establish and advertise in Matrix and

06:18.000 --> 06:21.680
then just session members, the ability to join these things.

06:21.680 --> 06:27.920
And so kind of just match up the IDs and this is how we set about integrating these two.

06:27.920 --> 06:34.760
Right, and then think about, well, how are we going to advertise exactly in Matrix that

06:34.760 --> 06:38.080
you want to be joining a call on LiveKit.

06:38.080 --> 06:40.360
So how do you advertise that you're in a call?

06:40.360 --> 06:44.480
Well, here it makes sense to just use the room state as opposed to in the past, we've

06:44.480 --> 06:52.240
been using room timeline events for things like the one-on-one WebRTC calls.

06:52.240 --> 06:55.680
It makes more sense to use room state because then it's very easy for clients to look

06:55.680 --> 06:56.680
up.

06:56.680 --> 07:02.280
They always have a copy of the most recent values of room state, right.

07:02.280 --> 07:06.680
And then if you want to ring someone else's device, well, there's another Matrix

07:06.680 --> 07:12.160
primitive, just a thing that Matrix provides that makes a lot of sense here, which

07:12.160 --> 07:17.720
is to use an event that contains mentions or contains intentional mentions.

07:17.720 --> 07:23.240
And this will push a notification and then you have data such as just what session you

07:23.240 --> 07:29.360
are joining in the first place as some repeated information about the member that is joining

07:29.360 --> 07:30.600
in particular.

07:30.600 --> 07:35.320
And then the active LiveKit focus that she's going to be joining through.

07:35.320 --> 07:41.960
And when she wants to send a notification out to perhaps ring other people's devices, this

07:41.960 --> 07:43.880
is what that looks like.

07:43.880 --> 07:50.120
She's just sending an m.rtc.notify event, which contains intentional mentions.

07:50.120 --> 07:56.440
So just by setting this room field to true, this is automatically going to be pushed as a push notification

07:56.440 --> 07:59.400
to everyone else in the room.
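
NOTE
Editor's sketch, not shown in the talk: a minimal Python illustration of the
ring notification described above. The event type is written as it was spoken
(m.rtc.notify) and may differ from the final spec; m.mentions with room set
to true is the intentional-mentions field. The name build_notify_event is
hypothetical.
```python
import json
def build_notify_event() -> dict:
    """Build a ring notification that intentionally mentions the whole room."""
    return {
        # Event type as spoken in the talk; treat it as an assumption.
        "type": "m.rtc.notify",
        "content": {
            # Intentional mentions: room=True targets every member of the room.
            "m.mentions": {"room": True},
        },
    }
event = build_notify_event()
print(json.dumps(event, indent=2))
```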

07:59.400 --> 08:05.120
And then finally, when you want to leave the call, that's another simple state event update.

08:05.120 --> 08:13.000
You just set your member object to an empty object.
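
NOTE
Editor's sketch, not shown in the talk: joining and leaving a session modeled
as updates to a key-value room state. All field names (session, member,
focus_active, service_url), the state key format and the helper names are
hypothetical; only the idea that leaving means writing an empty object comes
from the talk.
```python
def join_content(session_id: str, device_id: str, livekit_url: str) -> dict:
    """Membership content advertising that this device is in the session."""
    return {
        "session": session_id,
        "member": {"device_id": device_id},
        "focus_active": {"type": "livekit", "service_url": livekit_url},
    }
def leave_content() -> dict:
    """Leaving overwrites the same state key with an empty object."""
    return {}
def is_joined(content: dict) -> bool:
    """An empty membership object means the member has left."""
    return bool(content)
# Room state behaves like a key-value store: state key to latest content.
room_state = {}
state_key = "@alice:example.org_DEVICEID"
room_state[state_key] = join_content("call-1", "DEVICEID", "https://livekit.example.org")
joined_before = is_joined(room_state[state_key])
room_state[state_key] = leave_content()
joined_after = is_joined(room_state[state_key])
print(joined_before, joined_after)
```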

08:13.000 --> 08:14.000
Right.

08:14.000 --> 08:19.320
So those were some of the fundamentals of Matrix RTC with the exception of encryption, which

08:19.320 --> 08:21.080
kind of deserves its own talk.

08:21.080 --> 08:23.880
And so I'm not going to attempt to cover that here.

08:23.880 --> 08:27.840
There's a great presentation from Timo at the Matrix Conference in 2024.

08:27.840 --> 08:31.360
If you want to go look up the details of that.

08:31.360 --> 08:36.360
But basically, we want Matrix RTC to be a fully featured call solution with all the bells

08:36.360 --> 08:44.240
and whistles that you might need for creating a consumer grade video calling experience.

08:44.240 --> 08:54.680
And so now we're going to see if the video output is going to work.

08:55.560 --> 09:10.280
Perhaps I think this is not seated properly.

09:10.280 --> 09:11.280
Okay.

09:11.280 --> 09:16.600
We're going to have on the one.

09:16.600 --> 09:17.600
Yes.

09:17.600 --> 09:18.600
Yes.

09:18.600 --> 09:19.600
Yes.

09:19.600 --> 09:30.240
We can do here.

09:30.240 --> 09:31.240
Okay.

09:31.240 --> 09:32.240
Right.

09:32.240 --> 09:38.680
So now we have Half-Shot joining us live and he is going to demonstrate a little

09:38.680 --> 09:46.480
bit about what it looks like for these raise hand and reaction features that we've

09:46.480 --> 09:53.680
recently added to the product, or to Element Call, which is a video calling application

09:53.680 --> 09:59.080
implementing Matrix RTC, what those things look like.

09:59.080 --> 10:00.080
Right.

10:00.080 --> 10:01.880
I suppose we don't have the slides up.

10:01.880 --> 10:02.880
So you can ask him.

10:03.200 --> 10:09.880
Can you, can you, can you share the slides?

10:09.880 --> 10:12.880
Yes.

10:12.880 --> 10:14.880
Who is the kind of person among the slides?

10:14.880 --> 10:15.880
Okay.

10:15.880 --> 10:18.880
Next slide.

10:18.880 --> 10:19.880
Right.

10:19.880 --> 10:28.280
So we have this really great contribution from Milton of Nordeck earlier in 2024 of adding

10:28.280 --> 10:33.480
the ability to raise your hand in Element Call, and the way this works is really cool.

10:33.480 --> 10:37.520
It's just using a reaction event in Matrix.

10:37.520 --> 10:42.320
So we're just reusing that same thing that Matrix gives us and you relate it to your

10:42.320 --> 10:47.120
call membership event and then we render it in the call interface.

10:47.120 --> 10:52.040
And next slide.

10:52.040 --> 10:54.600
This is what that event actually looks like.

10:54.640 --> 11:04.960
We have the relation and the key, I suppose, yeah, it's just a hand and yeah, it's just

11:04.960 --> 11:06.320
the standard reaction.
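
NOTE
Editor's sketch, not shown in the talk: a raised hand expressed as a standard
Matrix reaction (an m.reaction event carrying an m.annotation relation) that
points at the sender's call membership event. The helper name, the example
event ID and the exact emoji used as the key are assumptions.
```python
def raise_hand_event(membership_event_id: str) -> dict:
    """Build a standard reaction that annotates the call membership event."""
    return {
        "type": "m.reaction",
        "content": {
            "m.relates_to": {
                "rel_type": "m.annotation",
                "event_id": membership_event_id,
                # A hand emoji; the exact key Element Call sends is an assumption.
                "key": "\U0001F590\uFE0F",
            }
        },
    }
hand = raise_hand_event("$membership_event_id")
print(hand["content"]["m.relates_to"]["rel_type"])
```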

11:06.320 --> 11:08.320
Next slide.

11:08.320 --> 11:09.320
Right.

11:09.320 --> 11:13.880
Then we also wanted to add emoji reactions to Element Call as well.

11:13.880 --> 11:19.040
Now these are a little different from normal reactions in Matrix because you might

11:19.040 --> 11:23.960
want to send the same emoji over and over again, whereas with reactions you can kind of just

11:23.960 --> 11:28.360
add and remove them from an event, you can't send the same reaction twice.

11:28.360 --> 11:35.560
So we ended up creating a custom event type for this, which we will hopefully go back

11:35.560 --> 11:39.120
and spec some time in the future.

11:39.120 --> 11:43.240
This is what they look like when they're rendered, and they have sounds, which are just not playing.

11:43.240 --> 11:47.720
But yeah, next slide.

11:47.960 --> 11:53.760
This is showing that they are completely normal.

11:53.760 --> 11:58.360
These aren't the standard reaction events, but it's using completely normal relations

11:58.360 --> 11:59.640
and that ticks.

11:59.640 --> 12:03.760
So it's just referencing your membership event and then you send the emoji.

12:03.760 --> 12:05.760
OK.

12:05.760 --> 12:06.840
Thank you, Half-Shot.

12:06.840 --> 12:09.960
I think we're going to speed things along.

12:09.960 --> 12:19.960
Let's see how quick this switch knows.

12:19.960 --> 12:23.400
I think it is, right?

12:23.400 --> 12:25.200
Is it doing something?

12:25.200 --> 12:26.200
There we go.

12:26.200 --> 12:27.200
OK.

12:27.200 --> 12:29.200
The guy can just move it over.

12:29.200 --> 12:30.200
Right.

12:30.200 --> 12:32.200
It's this that needs to be moved over.

12:32.200 --> 12:33.200
OK.

12:33.200 --> 12:35.200
There we go.

12:35.200 --> 12:37.200
And here.

12:37.200 --> 12:38.200
Perfect.

12:38.200 --> 12:39.200
Right.

12:39.200 --> 12:44.240
So as I was saying, Matrix RTC has ambitions to be far more than just voice over IP.

12:44.240 --> 12:48.040
That's why we refer to these as sessions rather than calls.

12:48.040 --> 12:52.560
But I wanted to highlight a couple of the ways in which Matrix RTC aims to be extensible.

12:52.560 --> 12:58.240
So we have this application field when you're joining a session that you can specify

12:58.240 --> 13:04.760
exactly what kind of data you expect to be communicating over the RTC session in the first place.

13:04.760 --> 13:10.160
You might use this to, you can chuck in any namespaced identifier and use

13:10.160 --> 13:14.280
this to build a whiteboard experience, a word processor, what have you.

13:14.280 --> 13:19.840
And you also might want to consider like integrating or we might want to integrate a third

13:19.840 --> 13:22.000
party call backend in the future.

13:22.000 --> 13:28.960
And so the way you might do this is just changing up the type of the active focus.

13:28.960 --> 13:34.160
And this is how you might integrate or just signal that you're using a Jitsi focus rather

13:34.240 --> 13:36.240
than a LiveKit focus.

13:36.240 --> 13:37.240
Right.

13:37.240 --> 13:46.360
So I take over now and then starting with saying that this is something we, like if you followed

13:46.360 --> 13:49.880
along the Matrix RTC project, we've advertised for a long time.

13:49.880 --> 13:56.800
Like extending this with different kinds of applications that all do RTC with the same architecture.

13:56.800 --> 14:01.480
And now we are finally at a state where we can, where we are confident in our current

14:01.480 --> 14:05.480
SDKs and we can show a demo of how this is actually done.

14:05.480 --> 14:11.640
So what will come next is a demo where we build a Matrix RTC application with the current

14:11.640 --> 14:17.400
SDKs, which does all the specified things: sends a state event into the room, connects to

14:17.400 --> 14:23.400
the LiveKit SFU, but does something other than calling.

14:23.400 --> 14:27.840
So I'm, again, in this, like, slightly odd situation where my screen is extended.

14:27.920 --> 14:33.360
So I do my best, but the mouse might be, like, a little less coordinated than what would

14:33.360 --> 14:36.080
be if it's on the same screen.

14:36.080 --> 14:47.040
So what I've prepared here is, sorry, is a project that is basically just a Matrix widget.

14:47.040 --> 14:52.880
This is cool because then I don't need to care about encryption and, like, authentication.

14:52.880 --> 14:58.640
So I'm, in one room, like, this M room with two accounts, it's actually the same account

14:58.640 --> 15:02.880
in this case, but two different devices; I might also have used different accounts.

15:02.880 --> 15:04.480
It wouldn't make any difference.

15:04.480 --> 15:09.480
And in this widget, I'm now starting a Matrix RTC session.

15:09.480 --> 15:14.600
So I'm joining the RTC session, Element Web should detect this as, oh, there is an ongoing

15:14.600 --> 15:15.600
session.

15:15.600 --> 15:18.760
It now thinks it's a call, but it's a different type.

15:18.760 --> 15:23.840
And then it will automatically build up an RTC channel over LiveKit.

15:23.840 --> 15:26.520
So this is all just Matrix RTC infrastructure.

15:26.520 --> 15:30.120
There's no backend running on my computer, and we'll have, like, a little demo of, like,

15:30.120 --> 15:31.880
a real time experience.

15:31.880 --> 15:35.040
So I start here first, it says it's connecting.

15:35.040 --> 15:38.200
And now it's connected, and we can see this client is actually detecting.

15:38.200 --> 15:39.840
There's something real-time going on inside.

15:39.840 --> 15:43.080
It's a little annoying because it's doing both, like, syncing the coordinates.

15:43.080 --> 15:47.120
And so we have the name in the background.

15:47.120 --> 15:52.120
So yeah, you can really tell it's just, like, a tiny demo, which was done in more hours

15:52.120 --> 15:53.760
than days.

15:53.760 --> 15:55.040
And then we have a maximize button.

15:55.040 --> 16:06.400
So we really can do all the things, like, write text, put it into a box, and edit it,

16:06.400 --> 16:11.480
and then move around the frame, if I would remember how to do this.

16:11.480 --> 16:15.920
I don't apparently, but oh, yeah, there it is.

16:15.920 --> 16:17.360
So you can do all those kind of things.

16:17.360 --> 16:21.200
And they're all synced in this, like, smaller frame.

16:21.200 --> 16:24.360
And then if we're done, we can leave on this side.

16:24.360 --> 16:27.400
And minimize here, okay, that's, like, a little broken.

16:27.400 --> 16:31.840
So we would look better like this, minimize here, and leave on this side.

16:31.840 --> 16:35.120
And it will detect that the RTC session is over.

16:35.120 --> 16:39.280
I should, like, clean up this frame also a little bit, but that's basically our

16:39.280 --> 16:43.840
minimal product, how we can do, or any of you could do with just a couple of lines,

16:43.840 --> 16:46.960
an RTC app, where you don't need to care about any back-end implementation,

16:46.960 --> 16:51.360
you have end-to-end encryption, with an asterisk, a little bit of work on this side

16:51.360 --> 16:56.960
needed, and where you have an end-to-end encrypted real-time experience for, yeah, anything.

16:56.960 --> 17:00.000
So this will be on GitHub eventually.

17:00.000 --> 17:01.960
I think I will just do a TWIM announcement.

17:01.960 --> 17:06.320
So if you are interested in such a thing and want to build one yourself,

17:06.320 --> 17:09.960
it should be possible in the next couple of weeks when everything is published.

17:09.960 --> 17:10.960
Cool.

17:10.960 --> 17:12.960
So then I give over to you again, right?

17:13.040 --> 17:14.920
Yeah, there you have it.

17:14.920 --> 17:18.000
And I try to get this nice again.

17:18.000 --> 17:21.920
Custom real-time experience built on Matrix.

17:21.920 --> 17:29.200
OK, and now with whatever amount of time we have left, I would like to just go through

17:29.200 --> 17:35.160
a few things that we have in mind as possible next steps for Matrix RTC.

17:35.160 --> 17:40.600
So one question is like, where do you put metadata about a call,

17:40.600 --> 17:44.280
such as just which LiveKit instance to connect to?

17:44.280 --> 17:49.480
We previously were putting this in a state event that just describes information about

17:49.480 --> 17:54.280
the whole call, but then because of the ownership issues, we switched to a situation

17:54.280 --> 17:59.960
where everybody proposes their own LiveKit instance and then clients are just running

17:59.960 --> 18:03.720
a simple algorithm that is selecting whatever the oldest membership is

18:03.720 --> 18:06.440
or converging on their life-kit instance.

18:06.520 --> 18:13.480
But in order to align better with our desire for this to really be a federated solution,

18:13.480 --> 18:20.680
we might want to enable clients to just publish their media on their homeserver's own

18:20.680 --> 18:27.080
LiveKit instance and then have everybody in the call connect to all of the LiveKit instances

18:27.080 --> 18:34.360
involved and fetch media from them so that you're really, you're really never converging

18:34.360 --> 18:39.320
on a single instance, but if you're working across federation, you're going to have multiple

18:39.320 --> 18:41.880
servers involved.

18:41.880 --> 18:47.240
Another thing that I've been spending a little bit of time thinking about is how to do call

18:47.240 --> 18:50.280
recording in Matrix RTC.

18:50.280 --> 18:57.080
Something interesting is that there's no solution out there today that I'm aware of for

18:57.080 --> 19:02.520
video calling that tries to do both end-to-end encryption and also call recording without

19:02.600 --> 19:03.800
just breaking that encryption.

19:03.800 --> 19:09.960
The usual way is that the call will start encrypted, but then as soon as someone hits the record

19:09.960 --> 19:16.120
button, the server is basically going to join a recording bot to your call, which is then of course

19:16.120 --> 19:19.160
breaking the encryption and just exporting all the media to the server.

19:20.120 --> 19:24.120
And this is kind of nice because you can get like AI transcripts and summaries this way,

19:24.120 --> 19:29.000
but what if we actually had a way that was more private?

19:29.000 --> 19:34.120
So of course the manual way of recording is just like ask your coworker like hey can you

19:34.120 --> 19:39.480
record this call, grab a recording of the screen and then upload it somewhere.

19:40.520 --> 19:46.360
But perhaps we could make this somewhat more of an automated process, just have a button

19:46.360 --> 19:51.800
where you click it and then it sends some signal into the matrix room saying hey I'm recording

19:51.800 --> 19:57.320
this call so that if other people try to record they'll know okay so somebody already has it covered

19:58.280 --> 20:04.440
and then your client will automatically download all the streams and save them somewhere

20:04.440 --> 20:07.800
and then upload it automatically back to the Matrix server.

20:07.800 --> 20:12.760
That could be a relatively simple way of getting recording, working with end-to-end encryption,

20:12.760 --> 20:18.280
but there's perhaps an even more interesting way which is what if the server was still

20:18.280 --> 20:23.000
responsible for the recording so it doesn't matter if like any client crashes or leaves the call

20:23.000 --> 20:29.800
or gets disconnected except the server isn't actually joining the call and decrypting all the

20:29.800 --> 20:36.200
streams but rather the server's, like, recording bot is still just sitting on the outside receiving

20:36.200 --> 20:43.160
the encrypted media and saving that to disk somewhere and then after the call participants can

20:43.160 --> 20:50.680
come back if they have the encryption key, receive the encrypted data and yeah just decrypt it so

20:50.760 --> 20:56.440
the server has never actually seen it still and here is how that might work within LiveKit's

20:57.160 --> 21:02.680
architecture so LiveKit not only provides the SFU component it also provides these

21:02.680 --> 21:09.320
ingress and egress components so I tried briefly over a weekend to get this working to have a

21:09.320 --> 21:15.000
little demo unfortunately there ended up being a missing piece but the idea is you can take like

21:15.080 --> 21:22.280
LiveKit's egress service, just get it to start streaming the encrypted SFrame data out and save

21:22.280 --> 21:29.080
it to disk using GStreamer and you have some like lightweight recording and playback service that

21:29.080 --> 21:36.040
the user communicates with and that just orchestrates all of this and then for playback you

21:36.040 --> 21:40.440
take the data on the disk and you'd stream it back in through the LiveKit ingress using

21:40.440 --> 21:47.560
a combination of GStreamer and a protocol called WHIP, it's this interesting thing, and then that's going to

21:47.560 --> 21:53.480
create a virtual participant which then you could use room state for whenever you need just lightweight

21:53.480 --> 22:01.400
mutable fields with something close to like last-write-wins semantics but another thing that I

22:01.400 --> 22:07.400
think will become important if you're trying to build more complicated applications is going

22:07.400 --> 22:13.800
to be relations currently you can think of like when you're reacting to another event that the

22:13.800 --> 22:19.880
reactions to this event are some kind of mutable set and so mutable sets I think end up

22:19.880 --> 22:26.920
being quite useful for building more general applications and so in the example of like a word

22:26.920 --> 22:34.600
processor, you might take a suitable CRDT for doing word processing and then

22:35.400 --> 22:40.760
any images and diagrams that the users create while using this you store those in the media repo you

22:40.760 --> 22:48.360
get references to them and then metadata about the document you could store in room state because

22:48.360 --> 22:52.520
this is stuff that's not super critical if like two people are editing at the same time

22:53.240 --> 22:58.200
one of their edits could just overwrite the other and that's kind of fine but then edits to the

22:58.200 --> 23:04.680
word document you might aggregate these by relating them to some kind of start-of-document event

23:04.680 --> 23:11.400
and this is nice because then clients have a way to go and ask the server give me all of the edits

23:11.400 --> 23:17.160
that are related to this document and then you have a way to sort of aggregate them after the fact

23:17.160 --> 23:21.880
without having to page through the entire timeline the entire history of that room
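
NOTE
Editor's sketch, not shown in the talk: asking the server for every event
related to a document's starting event, using the client-server relations
endpoint. The homeserver URL, room ID, event ID and the relation type
org.example.document.edit are made-up illustrations.
```python
from urllib.parse import quote
def relations_url(homeserver: str, room_id: str, event_id: str, rel_type: str) -> str:
    """Build the URL that lists events related to one parent event."""
    return (
        f"{homeserver}/_matrix/client/v1/rooms/{quote(room_id, safe='')}"
        f"/relations/{quote(event_id, safe='')}/{quote(rel_type, safe='')}"
    )
url = relations_url(
    "https://matrix.example.org",
    "!doc:example.org",
    "$start",
    "org.example.document.edit",  # hypothetical relation type for edits
)
print(url)
```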

23:22.840 --> 23:29.240
right and then finally one thing that we're thinking about is if we want to build more of these

23:29.240 --> 23:34.600
applications that aren't just VoIP, we're really starting to need something like a Matrix

23:34.600 --> 23:40.840
RTC SDK and so things that an SDK could be well suited for is like simplifying the connection

23:40.840 --> 23:45.880
to LiveKit, this currently takes some boilerplate code to get working because you have to

23:45.880 --> 23:56.760
fetch some tokens and whatever but then just also the ability to have a more abstracted data model

23:56.760 --> 24:02.920
for how you exchange data in real time and then persist it could be nice and so having some

24:02.920 --> 24:10.600
way to save and restore arbitrary documents from a Matrix room using this sort of combination of

24:10.600 --> 24:17.080
room state and relations could be really interesting and perhaps we could even integrate this with a

24:17.080 --> 24:24.520
one-size-fits-all CRDT such as Automerge, I don't know, but I think regardless the future is

24:24.520 --> 24:32.280
ripe for building all sorts of real-time experiences on top of Matrix, so thank you all for tuning in

24:40.600 --> 24:45.400
as there is no follow up talk there's maybe some time for questions are there some questions

24:46.200 --> 24:50.680
right yeah so you're sure that they move with a picture or extension or that's not called

24:50.680 --> 24:58.840
so I'm sure this is packed as a model so basically oh yeah that's what we do always so

24:59.400 --> 25:05.000
the question was in the demo which I've shown um if the things which we're seeing in that demo

25:05.080 --> 25:11.480
are specced, and I mean there are different iterations of being specced, so there is, like, have there proper

25:11.480 --> 25:16.680
MSCs been written, which means that people, like, properly wrote out what they are doing and don't

25:16.680 --> 25:21.240
run random stuff in the background, and then there is actually making it into the Matrix spec, and

25:21.240 --> 25:27.640
making it into the Matrix spec, like, none of the VoIP stuff is there yet, so that's like a longer

25:27.640 --> 25:34.600
procedure but for like the format of the state event what clients have to provide to understand it

25:34.600 --> 25:39.800
and for how to connect to the LiveKit SFUs, that is all written up in MSCs, there are

25:40.360 --> 25:46.600
a couple of things, like how we do the ID of a membership, which is like currently in discussion so

25:46.600 --> 25:52.840
there will be updates to the MSCs which is like a normal MSC process but yeah that's the current state

25:52.840 --> 25:59.000
so like all this could be understood by reading through MSCs but it's not yet in the spec so

25:59.080 --> 26:04.200
clients aren't forced to support it um like you will have lots of clients which don't support it

26:04.200 --> 26:09.080
but um Element Web does almost the same thing as the MSCs, except for this, like, membership

26:09.080 --> 26:18.600
identifier thing which is like a super recent change yeah and uh like so so you credit call nowadays

26:18.600 --> 26:24.360
power by announcing your presence can inhibit any specific way I credit it doesn't mean there are

26:24.440 --> 26:31.080
what ways they're to decline it for but so we call you is there a way to say no for it and then

26:31.080 --> 26:36.440
kind of stop the reading actually yeah this is this is something that the question would be

26:36.440 --> 26:43.240
right so the the question is um we can advertise you know starting a call but um do you know

26:43.240 --> 26:50.280
how to decline a call say that you you're rejecting a call um right this is one of the things

26:50.360 --> 26:58.200
that's currently still missing from our specification right now but there are ideas floating around

26:58.200 --> 27:05.000
for this like the primary one would be so um the ringing comes from the the notify event

27:05.560 --> 27:12.120
rather than the the state event where you're joining the call so one option would be the receiving

27:12.200 --> 27:21.640
participant just takes uh a reaction or some other relation and yes as thumbs down or uh no or whatever

27:21.640 --> 27:31.400
and uh attaches that to the notification event that could already work right um you were talking about them

27:31.400 --> 27:40.360
a Matrix RTC SDK and I was thinking one way to expose that would be having a Matrix RTC

27:40.360 --> 27:50.040
bot which could join the call and not really going to the call so all the basic coming of me

27:50.920 --> 27:59.240
the rest thing so that it's a full participant it's just not a human and thanks so that would be

27:59.240 --> 28:08.280
one way to codify what we're thinking that's a lot right right so I suppose um this is more of a

28:08.280 --> 28:18.840
comment yeah just about um rather than building an SDK around um creating clients that interact

28:18.840 --> 28:26.040
with Matrix RTC in a special way that it could be um interesting as well to have a way to create

28:26.120 --> 28:35.400
bots that join just calls not necessarily other experiences and yeah I do new things in them

28:37.000 --> 28:41.400
right I think this is something we're certainly interested in having at

28:42.520 --> 28:53.320
some point in the future is it yeah oh wait right so um on the just how we're right right

28:53.880 --> 29:00.920
the question how we're managing to encrypt the calls um so just on the the transport side

29:01.560 --> 29:07.640
we're using the SFrame standard uh which I I don't know the full details of but um

29:10.760 --> 29:19.160
basically we have what the things we get are um a small field in the trailer to each frame I believe

29:20.120 --> 29:26.760
which indicates the index of a key that you're currently using um so clients are able to

29:27.560 --> 29:34.760
potentially be switching between different keys um during the same call and still have this all be

29:34.760 --> 29:41.000
decrypted seamlessly by the receiving clients since there's information about um when exactly the

29:41.000 --> 29:48.360
participant switched and on the key management side uh this is a part that's also uh still subject
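
NOTE
A rough sketch of the key-index idea described above, assuming the receiver keeps every key a sender has announced and picks one by the index carried with each frame. It does not implement the SFrame wire format or any real decryption; the class and method names are hypothetical.
```python
from typing import Dict
class KeyRing:
    """Keys a receiver holds for one sender, looked up by key index."""
    def __init__(self) -> None:
        self._keys: Dict[int, bytes] = {}
    def set_key(self, index: int, key: bytes) -> None:
        # Keep older keys so frames still in flight under an earlier index stay decryptable.
        self._keys[index] = key
    def key_for_frame(self, frame_key_index: int) -> bytes:
        # The index carried with each frame tells the receiver which key to use.
        try:
            return self._keys[frame_key_index]
        except KeyError:
            raise LookupError(f"no key received yet for index {frame_key_index}") from None
# Hypothetical usage: the sender rotates from key 0 to key 1 in the middle of a call.
ring = KeyRing()
ring.set_key(0, b"first-key")
ring.set_key(1, b"rotated-key")
```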

29:48.360 --> 29:55.400
to some change um the solution we have today is just that when you're joining a call um each participant

29:55.400 --> 30:04.360
in the room will go and uh send uh the key that they intend to be using uh into the room as a

30:04.360 --> 30:11.080
timeline event and this is an encrypted timeline event and so we we don't get all the nice properties yet

30:11.080 --> 30:19.480
like perfect forward secrecy uh post-compromise security yet um but we are intending to move

30:19.480 --> 30:25.720
to a solution where keys are not shared just in the room timeline but rather directly uh one on one

30:25.720 --> 30:34.120
with the other call participants by using matrix to-device messages and this is the um the solution

30:34.120 --> 30:39.560
where um you really should go and check out Timo's full talk on the subject to get more of the details

30:39.720 --> 30:45.160
about what we might be going with there yeah so basically that that talk then gives you

30:45.160 --> 30:50.200
what the MSC will become so currently we I think there are like multiple sections in the MSC

30:50.200 --> 30:56.040
it will so that's also point which isn't like as far as you want it to be um um one section is about

30:56.040 --> 31:00.760
this like timeline thing and then we have the to-device version which will be much superior but

31:00.760 --> 31:05.560
it's still in the making then I think one last question we had right

31:06.280 --> 31:10.280
probably it may have more feature question but since you showed this mouse moving to the

31:10.280 --> 31:14.120
mouse feature did you have a thought about the feature where when you share your screen

31:14.120 --> 31:18.920
others can basically move their mouse because that's right that would be the perfect tool for

31:18.920 --> 31:23.160
that type of timing that's always the thing that's interesting but I realized this is really hard

31:23.160 --> 31:30.280
on like Wayland apparently uh so did you have a thought about it and so yeah so a question for

31:30.360 --> 31:37.240
this stream um when we showed the mouse moving around is there also like the opposite direction where

31:38.440 --> 31:42.920
the listener can control the mouse on my computer and actually take control over the computer

31:45.160 --> 31:51.800
okay or just showing a pointer um so this is very much like beyond the matrix RTC

31:51.800 --> 31:58.520
aspect because it's just a client needing to do it and yeah I think like the the question already

31:58.600 --> 32:04.200
contained this is mostly an operating system security thing and the trend is very much in the

32:04.200 --> 32:09.080
direction that this is less common so operating systems are stricter and stricter about how much

32:09.080 --> 32:15.720
applications can do with your window management and mouse control and that could be such a thing

32:15.720 --> 32:22.200
at least drawing something on top but like the the example of the question was like

32:22.280 --> 32:27.880
pair programming and I think there almost a more interesting solution would be what editors

32:27.880 --> 32:33.480
are currently heading towards where you have like a proper pair programming editor session and

32:33.480 --> 32:38.680
that could of course be done over matrix RTC like there could be a Visual Studio Code extension

32:38.680 --> 32:45.400
joining an RTC session and then doing all the mouse movements and all the edits over RTC

32:45.400 --> 32:50.760
and I think there are already like things that do this I mean like Microsoft ships their own thing

32:51.640 --> 32:57.080
and one could maybe find a way to build a back end on matrix RTC so yeah definitely multiple

32:57.080 --> 33:02.520
and lots of opportunities to use matrix RTC for this kind of secure pair programming session

33:02.520 --> 33:08.440
not sure if like mouse control over screen sharing will be the final thing but super interesting

33:08.440 --> 33:16.840
topic definitely oh yes yeah fast on so you just left it right the multiple projects moving to

33:17.320 --> 33:21.400
as of now for example picker but I think that's how someone from the community

33:23.400 --> 33:28.840
will you be possible to just use their platform are you in this room or do you have permission

33:28.840 --> 33:35.720
to participate in this call there's something one could extend but not as of now and then you get

33:35.720 --> 33:42.200
an auth token from the from the LiveKit SFU so yeah that would totally work like a

33:42.200 --> 33:48.440
BigBlueButton already installs a Docker container with LiveKit you could just additionally

33:48.440 --> 33:56.680
add this JWT token service you just give the JWT token service a LiveKit master token or that

33:56.680 --> 34:03.400
it's called differently but something along those lines and then you basically made your

34:03.400 --> 34:10.600
LiveKit service for BigBlueButton a matrix RTC back end so yeah that is totally feasible to

34:10.600 --> 34:17.480
basically use the same the same back end solution on both so thank you very much Robin thank you

34:17.480 --> 34:19.480
Timo thank you Will

