WEBVTT

00:00.000 --> 00:18.880
So now, Martin. It was supposed to be Paul, but Paul couldn't make it, so Martin

00:18.880 --> 00:23.880
will just present the presentation about Kubernetes and AI himself.

00:24.320 --> 00:25.880
Thank you very much.

00:25.880 --> 00:27.520
Hello, everybody.

00:27.520 --> 00:31.380
Remember now, I'm like a lecturer up here; if I see you nodding off, I'll name and

00:31.380 --> 00:32.880
shame you.

00:32.880 --> 00:34.080
No, it's only a joke.

00:34.080 --> 00:36.280
All right.

00:36.280 --> 00:42.000
So, AI. Folks, businesses are out there trying to gain an advantage

00:42.000 --> 00:45.000
with AI.

00:45.000 --> 00:51.920
One of the key aspects is being able to build better AI applications so that they

00:51.920 --> 00:54.680
can get the return on investment in AI.

00:54.680 --> 00:58.880
Because AI costs money: to run your models, to leverage them, to train them, and

00:58.880 --> 01:06.920
so on, et cetera, and for that to be justified, they need to see the return on investment.

01:06.920 --> 01:08.400
So my name is Martin Hickey.

01:08.400 --> 01:13.400
I am an open source developer and I work over at IBM, and I'm going to be walking

01:13.400 --> 01:18.280
through; because we've a five-minute talk, I'm going to concentrate on two small tenets

01:18.280 --> 01:24.400
for the moment, and you can take it from there and check them out yourself afterwards.

01:24.400 --> 01:31.000
So the first of the two tenets I'm going to look at is taking an

01:31.000 --> 01:35.520
off-the-shelf open model and how you tune it for your own domain data.

01:35.520 --> 01:39.680
So I'm not looking at your general model, the one you can ask different questions,

01:39.680 --> 01:41.480
that's been trained on the internet and stuff.

01:41.480 --> 01:46.480
But what about if you need to build an application for your customer or for yourself, and

01:46.480 --> 01:48.480
you need domain-specific data in it?

01:48.480 --> 01:54.480
The second one is: how do we take models and actually use them to execute complex

01:54.480 --> 01:57.680
workflows?

01:57.680 --> 02:00.160
So, starting off with the tuning aspect.

02:00.160 --> 02:05.480
So when we take a model, whatever model that's out there, the open models, it's going

02:05.480 --> 02:11.480
to be trained from everything that's on the internet, and also any synthetic data after

02:11.480 --> 02:15.480
that, because believe it or not, we've used up all the data on the internet at this

02:15.480 --> 02:16.480
stage, okay?

02:16.480 --> 02:22.480
But the important part to this is that if we're going to use it for customers or for

02:22.480 --> 02:28.280
ourselves, we're going to need to put domain data into that, and that's called tuning.

02:28.280 --> 02:33.480
Now you're going to turn around to me and say, RAG, what about RAG?

02:33.480 --> 02:34.480
Yes.

02:34.480 --> 02:35.980
They're not mutually exclusive.

02:35.980 --> 02:40.280
You can tune your model with domain data, and you can also use RAG as well.

02:40.280 --> 02:43.880
And generally, you'll probably use RAG for data that changes quite a lot.

02:43.880 --> 02:47.880
You'll tune in your domain-specific knowledge, and for data that's updating quite regularly,

02:47.880 --> 02:50.880
you're going to use RAG, because there's going to be an overhead in the call across

02:50.880 --> 02:53.880
to a vector database to pull back the data.
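That split — tune in stable domain knowledge, retrieve fast-changing data at query time — can be sketched with a toy retriever. Everything here is illustrative: the documents, the bag-of-words "embedding", and the prompt template stand in for a real embedding model and vector database.

```python
import math
from collections import Counter

def embed(text):
    """Toy 'embedding': a bag-of-words count vector (stand-in for a real model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Fast-changing facts live outside the model, in a (toy) vector store.
documents = [
    "Order 1234 shipped on Monday via express courier.",
    "The EU region price list was updated to 49 euro per seat.",
    "Support hours are 9am to 5pm CET on weekdays.",
]
index = [(doc, embed(doc)) for doc in documents]

def retrieve(query, k=1):
    """Return the k stored documents most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda pair: cosine(q, pair[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

def build_prompt(query):
    """Augment the user question with retrieved context before calling the model."""
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}"

print(build_prompt("When did order 1234 ship?"))
```

The tuned model supplies the stable domain understanding; the retrieval step supplies whatever changed since the last tuning run.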

02:53.880 --> 02:59.880
Now the problem is, I suppose, there are two kinds of complications with tuning models, or open

02:59.880 --> 03:00.880
models.

03:00.880 --> 03:05.880
Number one, trying to contribute into an open model today is difficult.

03:05.880 --> 03:10.880
So what happens is, instead of building a new version of the model, we build a new variant.

03:10.880 --> 03:15.880
Hands up here, who has seen the many different variants of Llama out there?

03:15.880 --> 03:16.880
Yeah.

03:16.880 --> 03:19.880
So there's a Llama for this, a Llama for that, and for the different domains.

03:19.880 --> 03:20.880
And that's no one's fault.

03:20.880 --> 03:22.880
It's just the way it is.

03:22.880 --> 03:31.880
The second one is around, when you try to tune the model, you hit this high barrier

03:31.880 --> 03:34.880
to entry, which is having AI knowledge, deep AI knowledge.

03:34.880 --> 03:38.880
Wouldn't it be nice to just tune the model with the data you have and not have to worry

03:38.880 --> 03:42.880
about all the complications that go with it?

03:42.880 --> 03:45.880
And that's where InstructLab comes in.

03:45.880 --> 03:56.880
So what InstructLab does for you: it's a workflow which allows you to put your data on top of the base model and tune that to create a new version of the model.

03:56.880 --> 03:58.880
So your base model doesn't change.

03:58.880 --> 04:04.880
The data you're putting in, or the knowledge you're adding to it, is just that: data or knowledge.

04:04.880 --> 04:07.880
It's like PDFs, it's Markdown, et cetera.

04:07.880 --> 04:13.880
So you don't need a deep knowledge or understanding of AI for that.

04:13.880 --> 04:23.880
Also, when a new version of the base model comes out, you can train it again, because your data is all stored in the taxonomy, which is essentially a tree of data.

04:23.880 --> 04:29.880
And every time you go to bring out the new version, you just tune the model again using that.
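That taxonomy is a tree of directories whose leaves each hold a qna.yaml file of seed questions and answers, plus pointers to the source documents to train from. A rough sketch of one knowledge leaf follows; the field names and repository URL here are illustrative and may differ between InstructLab versions, so check the project's taxonomy documentation for the exact schema.

```yaml
# Illustrative qna.yaml for a hypothetical "insurance" knowledge leaf.
# Field names are a sketch, not the canonical InstructLab schema.
version: 3
created_by: your-github-handle
seed_examples:
  - context: |
      Policy holders must report a claim within 30 days of the incident.
    questions_and_answers:
      - question: How long do I have to report a claim?
        answer: A claim must be reported within 30 days of the incident.
document:
  repo: https://github.com/example-org/policy-docs   # hypothetical source repo
  patterns:
    - "*.md"
```

Because the seed data lives in this tree rather than inside the model weights, re-tuning against a new base model is just a matter of running the same workflow again over the same taxonomy.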

04:29.880 --> 04:37.880
The other aspect of the data is, you can contribute your knowledge out into the community, because it's an open, vibrant community that's out there.

04:37.880 --> 04:38.880
Okay?

04:38.880 --> 04:43.880
Now, you may say, okay, we've got private information, we're not putting that out there.

04:43.880 --> 04:53.880
But that's okay too, because you can take this workflow, bring it inside, have your own taxonomy that you build up, and then build versions of your model as you go along.

04:53.880 --> 04:58.880
Or tune your versions of your model.

04:58.880 --> 05:07.880
The second tenet that I mentioned is around using your model, be it a large language model, a small language model, whatever,

05:07.880 --> 05:13.880
for something more than user prompts, or just, you know, chat or whatever else.

05:13.880 --> 05:22.880
Look, it's great to ask it to do a poem or do something else or ask it a few questions, but let's make it a bit more powerful, okay?

05:22.880 --> 05:26.880
The way to help with this is the agentic frameworks.

05:26.880 --> 05:36.880
What the agentic framework is, essentially, is making your large language model, or whatever type or version of model you have, the brain of the particular application.

05:36.880 --> 05:43.880
So the way I kind of look at it with an agent is, it's like a wrapper with your model in there.

05:43.880 --> 05:47.880
And then you've a series of flows or actions you can put in there.

05:47.880 --> 05:52.880
"Five minutes left" in a five-minute talk, this is a good one. Oh, sorry, that says one minute.

05:52.880 --> 05:55.880
Okay, I'm not allowed to do those jokes, okay?

05:55.880 --> 05:58.880
Sorry, you have to lighten up a lightning talk.

05:58.880 --> 06:01.880
I'm sorry about that, it's a very bad joke.

06:01.880 --> 06:12.880
Anyway, so what you want to do then is put in a series of tools or functions around that, to give the large language model more power, or more drive.

06:12.880 --> 06:25.880
So the way I look at it is: imagine you want to use this model to provide you with the best route to some location, be it home or somewhere you have to go.

06:25.880 --> 06:36.880
So instead of just saying, right, give me the best route, wouldn't it be great if the model could, first of all, ask Google Maps: what's the traffic like in that location at the moment?

06:37.880 --> 06:42.880
But in addition, maybe it could go out and ask a weather service: what kind of weather is there at the moment?

06:42.880 --> 06:49.880
But while you're doing that, couldn't it also ask any emergency service APIs in your area or in your country?

06:49.880 --> 06:54.880
So imagine being able to take all that information together to make a decision to give back to you.
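The route example can be sketched as a minimal tool-calling loop. This is not any particular framework's API: the tool names, the hard-coded planner standing in for the model, and the canned tool responses are all invented for illustration of the pattern (model plans, agent executes tools, results are synthesised).

```python
# Toy agent loop: a stubbed "planner" (the LLM's role) chooses tools,
# the agent runs them, and the observations drive the final answer.

def traffic_tool(location):
    """Stand-in for a Google Maps / traffic API call."""
    return {"location": location, "congestion": "heavy on the N7"}

def weather_tool(location):
    """Stand-in for a weather API call."""
    return {"location": location, "forecast": "rain"}

def emergency_tool(location):
    """Stand-in for an emergency-services API call."""
    return {"location": location, "incidents": []}

TOOLS = {"traffic": traffic_tool, "weather": weather_tool, "emergency": emergency_tool}

def plan(query):
    """In a real agent the LLM produces this plan; here it is hard-coded."""
    return ["traffic", "weather", "emergency"] if "route" in query else []

def run_agent(query, location):
    # Execute every tool the planner asked for and collect the observations.
    observations = {name: TOOLS[name](location) for name in plan(query)}
    # In a real agent the LLM would synthesise these observations into prose.
    if observations and observations["traffic"]["congestion"].startswith("heavy"):
        return f"Avoid the N7; expect {observations['weather']['forecast']}."
    return "No route advice needed."

print(run_agent("best route home", "Dublin"))
```

The point of the pattern is that the model is no longer limited to what it memorised at training time: each tool call pulls in live information before the answer is composed.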

06:54.880 --> 06:59.880
And that's the key with agentic, as I look at it: put in your agents and use your model as the brain for it.

06:59.880 --> 07:03.880
So why the BeeAI framework?

07:03.880 --> 07:05.880
First of all, it's open source.

07:05.880 --> 07:08.880
But out of the box, it's production ready.

07:08.880 --> 07:10.880
So what do I mean by that?

07:10.880 --> 07:15.880
So what I mean by that is, it's ready for the different errors and real world situations.

07:15.880 --> 07:18.880
Because it's no longer where we just ask the model to do something.

07:18.880 --> 07:23.880
We're asking the model to call external systems to get information back.

07:23.880 --> 07:27.880
And as we all know, if you've ever written to an external system, there's always an error.

07:27.880 --> 07:29.880
Something goes down.

07:29.880 --> 07:35.880
And being able to handle that and being able to troubleshoot it and look at the monitoring and the telemetry, that is key.
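A minimal sketch of that kind of error handling, assuming nothing about any specific framework: wrap each external tool call in retries with exponential backoff, and re-raise once the retries are exhausted so the failure can be surfaced and monitored. Real frameworks layer timeouts, telemetry, and circuit breakers on top of this.

```python
import time

def call_with_retries(fn, retries=3, base_delay=0.01):
    """Retry a flaky external call with exponential backoff.

    Re-raises the last error if all attempts fail, so the caller
    (or its monitoring) can see that the tool is down.
    """
    for attempt in range(retries):
        try:
            return fn()
        except ConnectionError:
            if attempt == retries - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))  # back off before retrying

# Simulate an external tool that fails twice, then succeeds.
calls = {"n": 0}
def flaky_tool():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("tool endpoint unavailable")
    return "traffic: clear"

print(call_with_retries(flaky_tool))
```
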

07:35.880 --> 07:38.880
Second of all is the tools: what kind of tools are in there?

07:38.880 --> 07:41.880
And in this framework, it's got a lot of tools.

07:41.880 --> 07:49.880
For example, if you want to do searches, if you want to query SQL, if you want to look at weather or traffic, it's in there.

07:49.880 --> 07:51.880
And it's building up as it goes along.

07:51.880 --> 07:56.880
And there are also some agents out of the box that you can use for text summarization or different things.

07:56.880 --> 08:01.880
There's monitoring as well. And, you know, if you want to build your agents, go and code away with it.

08:01.880 --> 08:06.880
But if you're not into coding and you want no-code, you can also have that available to you.

08:06.880 --> 08:13.880
So to just finish up, I put the two links up here because basically I haven't been able to talk about them much.

08:13.880 --> 08:16.880
But check it out, see what you think and give it a go.

08:16.880 --> 08:17.880
Thank you very much indeed.

08:17.880 --> 08:18.880
Thank you.

