WEBVTT

00:00.000 --> 00:13.000
Hi everyone, my name is Nuno Reis, I work for Talkdesk, which is a call center company

00:13.000 --> 00:16.080
that exists just in the cloud.

00:16.080 --> 00:24.560
We offer contact center solutions, basically, and down the road, we use a lot of open source

00:24.560 --> 00:25.560
as well.

00:25.560 --> 00:31.340
As you may see here, I will speak today about Kamailio, FreeSWITCH and RTPEngine, mainly

00:31.340 --> 00:38.680
Kamailio, some FreeSWITCH as well, we will get into that, and let's do it.

00:38.680 --> 00:47.520
So I will just try to explain what the problem was in Talkdesk for this, and we will

00:47.520 --> 00:54.520
do a quick overview of our global network, I will then go into the details of the

00:54.520 --> 01:02.520
design and architecture behind what we did, and some future work that I think might

01:02.520 --> 01:05.560
still have room to be done.

01:05.560 --> 01:14.320
Okay, so the problem we had was somehow the company decided that we should try to go after

01:14.320 --> 01:20.520
some on-prem contact centers, not exactly like Talkdesk, and see how we could

01:20.520 --> 01:23.920
be able to enable Talkdesk for them as a solution.

01:23.920 --> 01:32.200
To do that, and since we are talking about on-prem or even other cloud contact centers

01:32.200 --> 01:42.040
that are pretty legacy, or kind of closed in themselves, we had to look at what

01:42.080 --> 01:51.560
were the existing frameworks and standards that we could tap into to actually enable this.

01:51.560 --> 01:58.880
One of the things, and I will focus a lot today on that, is how we enabled the co-pilot

01:58.880 --> 02:09.680
feature that we already had on the, let's call it, native Talkdesk solution, by using SIPREC,

02:09.680 --> 02:16.680
by connecting SIPREC with Talkdesk, with these open source components.

02:16.680 --> 02:23.000
Basically, we also looked at what we call the X-Connect ecosystem, the X-Connect

02:23.000 --> 02:35.840
ecosystem is a group of components that includes Kamailio, RTPEngine, SBC, and we have

02:35.880 --> 02:42.080
some flows already supported in that, but maybe there were missing parts there, to actually

02:42.080 --> 02:49.280
enable this. We tried to identify which were those missing parts, and how could we close

02:49.280 --> 02:58.240
those gaps using open source, like Kamailio, RTPEngine, or FreeSWITCH.

02:58.240 --> 03:06.800
So again, the idea was to connect external sources through SIPREC with us, make this

03:06.800 --> 03:14.120
globally available, work at scale, and we also had a requirement, an internal requirement in

03:14.120 --> 03:24.320
Talkdesk, to keep the AI ecosystem, which is a different vertical in Talkdesk, as

03:24.320 --> 03:29.600
simple as possible, since we didn't want to increase the effort on changing

03:29.600 --> 03:35.440
much of that, and we kind of achieved that through what we did.

03:35.440 --> 03:43.240
So our global network is basically, so we have everything cloud hosted, as I said, we have

03:43.240 --> 03:49.680
a global presence, more or less like this, for SIP connectivity and some other use cases,

03:49.760 --> 03:57.680
but in this particular situation, SIP connectivity. At a high-level overview of what we use,

03:57.680 --> 04:05.840
basically, we have the SBC component, Kamailio, RTPEngine, and then we just connect everything

04:05.840 --> 04:14.560
from AWS, deployed as EC2s, to the SIPREC side of Talkdesk, where other people

04:14.560 --> 04:26.640
use Kubernetes, EC2s, so it's kind of a mix there, and we get into this one to actually support

04:26.640 --> 04:34.560
SIPREC. So the idea here was: we get the SIPREC in, we feed it into Kamailio, Kamailio

04:34.560 --> 04:41.120
would do some initial work with that and would send it to a new piece, which is the SRS,

04:41.200 --> 04:47.920
the SIPREC server. The SIPREC server would then send it back over Kamailio into a FreeSWITCH

04:47.920 --> 04:55.760
instance, and this FreeSWITCH instance would hairpin the participants of the call together,

04:56.880 --> 05:05.440
and reply back, and basically the dialog would be established. By doing this and using a specific

05:05.440 --> 05:17.360
module in FreeSWITCH, which is mod_audio_stream, we would be able to stream, over WebSocket,

05:17.360 --> 05:23.680
any call happening in FreeSWITCH to a WebSocket, which was the interface that the guys

05:23.680 --> 05:32.160
in Talkdesk were using for initiating a Copilot instance. Then here there is some magic that

05:32.160 --> 05:40.720
I will not go into much detail, but we would then start a session on some kind of web browser

05:40.720 --> 05:46.960
in that on-prem system, where the Copilot would connect with an agent that was already mapped

05:46.960 --> 05:53.440
to an agent in Talkdesk, as if it was a native agent, and all that will go from there.

05:54.000 --> 06:04.640
Okay, so, in terms of open source components: Kamailio, FreeSWITCH, mod_audio_stream for FreeSWITCH,

06:05.360 --> 06:15.840
and we started by using the drachtio SRS. I have to say that they were pretty ingenious in the proposal

06:15.920 --> 06:25.760
they made for providing an SRS and SRS capabilities in drachtio. So they have two flavors:

06:25.760 --> 06:33.440
one is using FreeSWITCH, the other is using RTPEngine. Actually, by implementing that diagram

06:33.440 --> 06:40.560
that I showed before, we support the two together as well. We are now using RTPEngine,

06:40.560 --> 06:49.440
and by leveraging the Janus mode of RTPEngine, we can subscribe to a video-room-like Janus

06:49.440 --> 06:57.600
session and get the stream of that, so that works also. We decided to go with mod_audio_stream

06:57.600 --> 07:03.600
for FreeSWITCH because the endpoint was already a WebSocket, we needed that, we could pass

07:03.600 --> 07:08.640
metadata along with it, so it worked better for the time frame that we had.

07:12.160 --> 07:19.120
Okay, some of the Kamailio modules that I used for all of this: dispatcher, htable,

07:19.120 --> 07:28.000
sdpops, textops, jansson, http_async_client, and some others, but maybe these were the major ones.

07:28.720 --> 07:38.080
So in terms of RTPEngine, we did not change much. RTPEngine is there for normalizing everything,

07:39.440 --> 07:46.960
and we have the Janus mode also: if we want to actually stream something off RTPEngine, it works as well.

07:47.840 --> 07:57.520
Now, an example of how I basically detect if I am on a SIPREC call in Kamailio.

07:58.560 --> 08:05.840
So here we are basically tapping into the body of the message to see if it's multipart/mixed.

08:07.280 --> 08:16.880
If it is, we continue, and then we see if there is an SDP part and the recording session metadata

08:17.280 --> 08:23.760
part. If we actually find the recording session metadata part, we check whether the content

08:23.760 --> 08:30.800
disposition is recording-session. If it is, we say okay, this is a SIPREC call, and then we do some other stuff.

08:30.800 --> 08:38.000
In this case, I'm actually storing in the hash table the Call-ID of the original SIPREC call.

08:38.000 --> 08:44.240
I will need that later because everything here is stateless and since I will be showing

08:44.240 --> 08:53.680
a call example in a few moments, but basically I'm keeping this Call-ID stored so that, when

08:53.680 --> 09:00.800
the call goes through all the components and gets to FreeSWITCH, I save the details of FreeSWITCH,

09:00.800 --> 09:10.880
like the IP of that instance, the ID of the FreeSWITCH instance itself. So by doing that

09:10.880 --> 09:19.760
and saving that, I will later emit an event out of Kamailio, via our X-Connect API, to a Kafka topic.

09:20.960 --> 09:27.120
Then our components on the other side would be listening on that topic and would be able to

09:27.120 --> 09:34.240
request the stream out of the right instance of FreeSWITCH through mod_audio_stream, and they will get

09:34.320 --> 09:41.280
basically the call streaming. So this is more or less what I'm doing. This latter part here:

09:41.280 --> 09:49.920
we are encoding the recording session metadata and the SDPs as Base64, so we can send them in that event,

09:51.120 --> 10:00.480
so that's what this is doing. On FreeSWITCH we created an extension, a dialplan extension, for

10:01.040 --> 10:11.440
the hairpin part. This is highly inspired by the work of drachtio, so the guys at drachtio have

10:12.160 --> 10:18.320
kind of a similar, simpler approach for this. So when we get the call in FreeSWITCH,

10:18.320 --> 10:24.800
I'm basically doing some tests to see if we are getting some request URI parameter

10:24.880 --> 10:32.000
that is coming from Kamailio on that particular call. If that matches, we enter here

10:32.000 --> 10:37.840
and then I apply here a regex on the SDP part to actually get, from the SDP part,

10:38.560 --> 10:45.840
two things: one is the original SIPREC Call-ID, that I will then be sending in a custom

10:45.840 --> 10:51.840
header back to Kamailio, and the other thing that is important to me is the destination URI,

10:52.560 --> 11:05.440
the original one, so I can link the original SIPREC SRS call with the reply part on the same SRS instance.

11:06.480 --> 11:17.920
So this is more or less what this is doing. So basically here we are using an external

11:18.800 --> 11:28.480
runtime for the SRS. I don't like it as it is today. It works fine, but an interesting thing

11:28.480 --> 11:35.200
to actually do here would be making Kamailio do what the SRS is doing outside of Kamailio,

11:35.200 --> 11:41.040
like in a module of some sort that would basically reuse, for instance, FreeSWITCH for the rest,

11:41.120 --> 11:47.520
as the SRS is doing already. Maybe that would be nice, I don't know, maybe we will do that.

11:51.200 --> 12:01.040
okay, yeah that's it, now let me just show you the call example with all this,

12:02.000 --> 12:26.080
if I am able, I don't need it, it works, okay, so, okay, so this is a SIPREC INVITE,

12:27.040 --> 12:35.920
so we have the multipart with SDP and recording session metadata, the first thing there.

12:37.360 --> 12:44.640
So for you to know, this is Kamailio, this is the SRS, this is FreeSWITCH, and this is Kamailio

12:44.640 --> 12:55.840
again. Okay, so the call enters Kamailio, okay, Kamailio, with the dispatcher,

12:55.840 --> 13:07.440
detects that this is SIPREC and sends it to the SRS instance. The SRS instance does a first kind of

13:07.520 --> 13:17.680
cleanup: it basically slices the SDP part and uses the first part of the SDP to actually

13:17.680 --> 13:25.600
pin the participant, the audio of the first participant, that will be bridged in FreeSWITCH at some point.

13:26.560 --> 13:35.360
Okay, so then here it's being sent to FreeSWITCH. I'm adding some SIPREC parameters at the top,

13:35.920 --> 13:43.600
as you see here in the SDP. I'm actually adding some extra attributes with the original Call-ID and the

13:43.600 --> 13:54.080
original destination URI, encoded, in this particular case, as hex, so that this is basically

13:54.080 --> 14:02.400
reused later. So basically FreeSWITCH here processes that and adds all those custom

14:02.400 --> 14:08.640
headers, the Talkdesk-something ones. So basically, this is important for that event that I talked about:

14:09.600 --> 14:16.160
so the core UUID of the FreeSWITCH instance that processed the call, the IP of that call,

14:16.160 --> 14:24.560
the AWS region where the instance is, and I'm actually adding as well the original Call-ID and the

14:24.560 --> 14:35.760
SIPREC destination URI, encoded, and then Kamailio will decode it later. So it gets into this.

14:36.720 --> 14:46.320
Now the answer: the answer that is coming from the SRS is also treated by the SRS and sent to

14:47.120 --> 14:53.360
FreeSWITCH, FreeSWITCH bridges the call together, everything goes back,

14:54.000 --> 15:03.280
until we send the 200 OK reply of the original SIPREC call back. When it gets here, I created some

15:03.280 --> 15:10.240
logic in Kamailio that basically, when I detect this step, I emit the event with all the

15:10.240 --> 15:17.680
things that I collected, namely the FreeSWITCH details that will then be

15:17.680 --> 15:30.320
used for asking the stream of the audio. OK, that's it, the time was a bit short, I could eventually

15:30.320 --> 15:37.520
show you some code in Kamailio, I showed you some examples already, but time is up, sorry guys,

15:37.520 --> 15:54.320
I hope this was interesting enough, so questions if you want, you can also find me on Mastodon if you

15:54.320 --> 16:02.320
want to engage.

16:04.320 --> 16:11.520
Completely stateless, yes. So basically I'm using dispatcher: every time I want an instance

16:11.520 --> 16:20.400
of an SRS runtime, it gets added to dispatcher by itself, and then the details to actually keep

16:20.400 --> 16:26.400
the state that is important to me are transferred back and forth in the messages themselves and

16:26.400 --> 16:32.400
reprocessed in the places I need. So yeah.