WEBVTT

00:00.000 --> 00:05.000
Good night, talk, William.

00:05.000 --> 00:07.000
Great.

00:07.000 --> 00:09.000
Hey, everyone.

00:09.000 --> 00:13.000
Great.

00:13.000 --> 00:15.000
Thanks for joining.

00:15.000 --> 00:16.000
Good to see everyone.

00:16.000 --> 00:19.000
Maybe a bit too much yelling at the bar last night.

00:19.000 --> 00:23.000
My voice is a little weak, but we'll try to give it a go.

00:24.000 --> 00:30.000
So this talk is Building Agentic GraphQL APIs with LLM Tool Use and Knowledge Graphs.

00:30.000 --> 00:35.000
You can find the slides at either that QR code or that short link there.

00:35.000 --> 00:36.000
My name's Will.

00:36.000 --> 00:38.000
I work for a company called Hypermode.

00:38.000 --> 00:41.000
We'll talk a little bit about what that is in a second.

00:41.000 --> 00:47.000
But before we dive in, there's a lot of terms in the title of this talk, right?

00:47.000 --> 00:49.000
It ends up being a little buzzwordy.

00:49.000 --> 00:52.000
I didn't really intend that when I wrote it.

00:52.000 --> 00:58.000
But let's maybe level set a little bit and talk some about the pieces here.

00:58.000 --> 01:00.000
So the first one is GraphQL.

01:00.000 --> 01:03.000
How many folks are using GraphQL or have used GraphQL?

01:03.000 --> 01:04.000
Okay.

01:04.000 --> 01:06.000
Maybe that's like half.

01:06.000 --> 01:09.000
So GraphQL is an API query language.

01:09.000 --> 01:12.000
And so we have this strict type system.

01:12.000 --> 01:15.000
We define the types in our API, how they're connected.

01:15.000 --> 01:17.000
This is the graph piece.

01:17.000 --> 01:21.000
And then at query time, the client describes exactly the data they want to bring back.

01:21.000 --> 01:24.000
How to traverse through their application data.

01:24.000 --> 01:29.000
So GraphQL makes this observation that your application data is a graph.

01:29.000 --> 01:39.000
And you get back a response that exactly matches the selection set, the pieces of the data graph that you said you wanted to be returned.
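
As a quick illustration, a query against a hypothetical schema for this conference might look like this (the field names are invented for the example, not from any real schema):

```graphql
query {
  talk(title: "Building Agentic GraphQL APIs") {
    title
    speaker {
      name
    }
    topics {
      name
    }
  }
}
```

The response is a JSON object shaped exactly like that selection set: `talk.title`, `talk.speaker.name`, a list of topic names, and nothing else.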

01:39.000 --> 01:42.000
I worked on open source GraphQL tooling.

01:42.000 --> 01:49.000
when I worked at a company called Neo4j, when GraphQL was first open sourced, in like 2016.

01:49.000 --> 01:55.000
And that was like a database integration where we were trying to generate a schema from a database.

01:55.000 --> 01:58.000
Generate database queries from a GraphQL request.

01:58.000 --> 02:04.000
And as part of that work, I wrote this book, published by Manning, Full Stack GraphQL Applications.

02:04.000 --> 02:10.000
This talk today is a little different take on how we're using GraphQL at Hypermode and some of our tooling.

02:10.000 --> 02:13.000
Anyway, so that's the GraphQL piece.

02:13.000 --> 02:16.000
The next piece of this is Knowledge Graphs.

02:16.000 --> 02:22.000
Knowledge Graphs, I think, are having this, like, resurgence of interest now in the context of AI.

02:22.000 --> 02:33.000
It turns out that if you use a Knowledge Graph to keep track of your data and pass that context to these LLM AI models,

02:33.000 --> 02:36.000
it can really improve the results.

02:36.000 --> 02:38.000
So we'll talk a bit about that.

02:38.000 --> 02:39.000
But what is a Knowledge Graph?

02:39.000 --> 02:43.000
There's lots of different definitions of Knowledge Graph out there.

02:43.000 --> 02:46.000
To me, a Knowledge Graph is just an instance of a property graph.

02:46.000 --> 02:54.000
So a property graph is this data model where we have nodes, nodes are the entities, the things.

02:54.000 --> 02:58.000
They have a label like the type, a way to group them.

02:58.000 --> 03:01.000
That's why we sometimes call this the labeled property graph model.

03:01.000 --> 03:07.000
And then we store arbitrary key value pair properties on nodes and relationships.

03:07.000 --> 03:11.000
So here's a very small Knowledge Graph.

03:11.000 --> 03:17.000
So I'm up there somewhere: a person named Will, employed by Hypermode.

03:17.000 --> 03:20.000
I'm giving a talk that has a topic, LLMs.

03:20.000 --> 03:23.000
There's another person up there, David Allen.

03:23.000 --> 03:24.000
He gave a talk yesterday.

03:24.000 --> 03:26.000
It also had the topic, LLMs.

03:26.000 --> 03:28.000
We both used to work for Neo4j.
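
As a sketch, the little graph just described could be modeled in plain TypeScript (a toy in-memory stand-in; the ids and relationship types are made up for illustration, not a real graph database API):

```typescript
// A toy labeled property graph: nodes with labels and key/value properties,
// plus typed relationships between them.
type GraphNode = { id: string; label: string; properties: Record<string, string> };
type Relationship = { type: string; from: string; to: string };

const nodes: GraphNode[] = [
  { id: "will", label: "Person", properties: { name: "Will" } },
  { id: "david", label: "Person", properties: { name: "David Allen" } },
  { id: "talk1", label: "Talk", properties: { title: "Agentic GraphQL APIs" } },
  { id: "talk2", label: "Talk", properties: { title: "Yesterday's talk" } },
  // One canonical Topic node, rather than a duplicate string per talk.
  { id: "llms", label: "Topic", properties: { name: "LLMs" } },
  { id: "hypermode", label: "Company", properties: { name: "Hypermode" } },
];

const rels: Relationship[] = [
  { type: "PRESENTS", from: "will", to: "talk1" },
  { type: "PRESENTS", from: "david", to: "talk2" },
  { type: "HAS_TOPIC", from: "talk1", to: "llms" },
  { type: "HAS_TOPIC", from: "talk2", to: "llms" }, // reuses the node, no duplicate
  { type: "EMPLOYED_BY", from: "will", to: "hypermode" },
];

// Traversal: every talk connected to a given topic node.
function talksAbout(topicId: string): string[] {
  return rels
    .filter((r) => r.type === "HAS_TOPIC" && r.to === topicId)
    .map((r) => r.from);
}
```

Because both talks point at the one canonical `llms` node, a traversal from that node finds them both; that is the "things, not strings" idea in miniature.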

03:28.000 --> 03:36.000
So I think one of the important pieces of Knowledge Graph is that there's some canonical representation of the thing.

03:36.000 --> 03:43.000
So for example, we have the topic LLMs that's connected to a couple of talks at this conference.

03:43.000 --> 03:48.000
And when we have another talk that's about LLMs, we don't create a duplicate LLMs node.

03:48.000 --> 03:51.000
We connect relationships to that one node.

03:51.000 --> 03:54.000
We have a canonical representation of the topic LLMs.

03:54.000 --> 03:59.000
I think this is kind of the most important piece of this concept of a Knowledge Graph.

03:59.000 --> 04:05.000
When Google released their Knowledge Graph, I want to say in like 2012, as part of Google search.

04:05.000 --> 04:08.000
They talked about this concept of things, not strings.

04:08.000 --> 04:15.000
And I think that's a really good way to reason about Knowledge Graphs.

04:15.000 --> 04:18.000
Okay, so the next buzzword in my title is LLMs.

04:18.000 --> 04:21.000
How many folks are using LLMs to some degree?

04:21.000 --> 04:22.000
Yeah, a lot of folks.

04:22.000 --> 04:28.000
So these are models like Claude or some of the OpenAI models,

04:28.000 --> 04:33.000
or the Open Source models like DeepSeek, which came out fairly recently,

04:33.000 --> 04:37.000
or Llama, which is made available by Meta.

04:37.000 --> 04:43.000
And the basic way to interact with these: we give the model some prompt, and the LLM predicts text.

04:43.000 --> 04:47.000
So they're just predicting: given your prompt and everything the model knows,

04:47.000 --> 04:50.000
like what are the most likely next strings of text?

04:50.000 --> 04:53.000
And that's what we get back.

04:53.000 --> 04:58.000
But we can also use models in more sophisticated ways.

04:58.000 --> 05:04.000
And there's this concept of agentic workflows for working with LLMs.

05:04.000 --> 05:07.000
Again, there's lots of different definitions of like, what is an agent?

05:07.000 --> 05:14.000
But fundamentally, I think an agent or an agentic workflow is just when your LLM is making a decision

05:14.000 --> 05:17.000
about the control flow of your application.

05:17.000 --> 05:22.000
This is typically done with what's called tool use or function calling,

05:22.000 --> 05:29.000
where you're defining tools or functions that are available for the LLM to decide to call.

05:29.000 --> 05:35.000
And you describe the type of data that you can get back from calling that function or that tool.

05:35.000 --> 05:40.000
And then the data is passed back along with your original prompt,

05:40.000 --> 05:45.000
back to the LLM again, and you say, hey, given this prompt with the new data,

05:45.000 --> 05:46.000
are we done yet?

05:46.000 --> 05:48.000
Or do you want to call another tool?

05:48.000 --> 05:51.000
And you sort of iterate on this until you get to the output.

05:51.000 --> 05:57.000
That's sort of like the basic building block, I think, for this concept of LLM agents.

05:57.000 --> 06:00.000
So, Hypermode, this is the company I work for now.

06:00.000 --> 06:07.000
We build open source tooling to enable developers to build what we call model native apps.

06:07.000 --> 06:09.000
We'll talk about what that is in a minute.

06:09.000 --> 06:14.000
Our main open source projects are Modus, which is a serverless API framework.

06:14.000 --> 06:16.000
This is mostly what we're going to talk about today.

06:16.000 --> 06:20.000
We also build and maintain Dgraph, which is a graph database.

06:20.000 --> 06:22.000
Anyone using Dgraph?

06:22.000 --> 06:24.000
Of course, a couple of folks.

06:24.000 --> 06:28.000
Badger, which is a distributed key value store, and Ristretto,

06:28.000 --> 06:29.000
which is a cache library.

06:29.000 --> 06:31.000
These are all written in Go by the way.

06:31.000 --> 06:35.000
Go is a big piece of what we build with.

06:35.000 --> 06:40.000
And then all these projects are open source, Apache 2; build and deploy them however you want.

06:40.000 --> 06:46.000
But the Hypermode hosting platform aims to be the best cloud hosting experience for these things.

06:46.000 --> 06:49.000
That's how we're supporting those projects.

06:49.000 --> 06:58.000
And so today we're going to talk about Modus, which is our serverless API framework that Hypermode builds and maintains.

06:59.000 --> 07:02.000
We released this, I want to say, in October.

07:02.000 --> 07:07.000
So like fairly, fairly new in the world.

07:07.000 --> 07:11.000
So I mentioned this term model native app, like what is that?

07:11.000 --> 07:15.000
Well, we think back to the three tier architecture.

07:15.000 --> 07:17.000
This is how I learned to build apps.

07:17.000 --> 07:19.000
This infrastructure still exists today.

07:19.000 --> 07:24.000
We have a client that talks to an application server that talks to a database.

07:24.000 --> 07:29.000
If we want to scale this up, we need to deploy another instance of our application server.

07:29.000 --> 07:32.000
put a load balancer in front, that sort of thing.

07:32.000 --> 07:35.000
And like all those pieces still exist today.

07:35.000 --> 07:37.000
And apps still get built exactly this way.

07:37.000 --> 07:42.000
But I think it's the abstractions that developers think about that have changed.

07:42.000 --> 07:45.000
So instead of thinking about application servers,

07:45.000 --> 07:51.000
typically we're just thinking about functions and deploying in this sort of serverless environment.

07:51.000 --> 07:58.000
And AI models we now want to incorporate as a first class citizen into the applications we're building.

07:58.000 --> 08:05.000
And we're using different types of databases and APIs that are sort of built for scale from the beginning.

08:05.000 --> 08:12.000
So that's really what we mean by model native app, where the model is sort of a first class citizen.

08:12.000 --> 08:17.000
And we're using more modern frameworks and ways to build and deploy our apps.

08:17.000 --> 08:21.000
Developer experience is really important in this world.

08:21.000 --> 08:25.000
So let's take a look at the developer experience for Modus.

08:25.000 --> 08:28.000
So we said this is a serverless framework.

08:28.000 --> 08:32.000
The abstraction that we think of for business logic is a function.

08:32.000 --> 08:34.000
So we write a function.

08:34.000 --> 08:42.000
Now, Modus uses WebAssembly to target as many languages as we can for development.

08:42.000 --> 08:44.000
Currently, we support AssemblyScript.

08:44.000 --> 08:46.000
Anyone using AssemblyScript?

08:47.000 --> 08:53.000
AssemblyScript is a TypeScript-like language for WebAssembly.

08:53.000 --> 08:55.000
We also support Go.

08:55.000 --> 08:57.000
How many Go folks are there?

08:57.000 --> 08:58.000
Okay, cool.

08:58.000 --> 08:59.000
A lot of folks.

08:59.000 --> 09:01.000
And we're adding more languages; Python's coming soon.

09:01.000 --> 09:04.000
Other languages are coming soon; we want to add as many as we can.

09:04.000 --> 09:06.000
So you write your function.

09:06.000 --> 09:11.000
And in the Modus SDK, you have connections to data and APIs,

09:12.000 --> 09:14.000
abstractions for working with models.

09:14.000 --> 09:17.000
Your functions compile to WebAssembly.

09:17.000 --> 09:24.000
We inspect the types that you've defined and the signature of your functions to generate a GraphQL API.

09:24.000 --> 09:27.000
And then you query that endpoint from the client.

09:27.000 --> 09:30.000
Modus, because we're using WebAssembly,

09:30.000 --> 09:38.000
sort of uses this execution plan that maps resolvers to functions that we've defined,

09:39.000 --> 09:44.000
and has this sandbox specific to that invocation of your function.

09:44.000 --> 09:50.000
So WebAssembly gives us a secure, sandboxed memory space for each invocation,

09:50.000 --> 09:52.000
which is quite interesting.

09:52.000 --> 09:55.000
If we look at the components of Modus for development,

09:55.000 --> 10:06.000
a lot of this is driven by a CLI that exposes a way to scaffold out a new app, for working with the SDK,

10:06.000 --> 10:11.000
and for compiling and running locally in WebAssembly.

10:11.000 --> 10:14.000
Then for production.

10:14.000 --> 10:17.000
We're typically just looking at the runtime.

10:17.000 --> 10:22.000
And again, we can deploy this to Hypermode,

10:22.000 --> 10:24.000
but it's all open source.

10:24.000 --> 10:30.000
We can run this anywhere that we can run WebAssembly modules.

10:30.000 --> 10:34.000
If we dive into the Modus runtime architecture,

10:34.000 --> 10:40.000
some interesting pieces here, perhaps worth pointing out, are the host services,

10:40.000 --> 10:44.000
so there are functions that run outside of the WebAssembly sandbox.

10:44.000 --> 10:51.000
So these are our connections to models, our database connections, things like this.

10:51.000 --> 10:55.000
We mentioned this concept of the WebAssembly sandbox,

10:55.000 --> 11:02.000
which gives us these nice memory safe secure sandboxes for each invocation.

11:03.000 --> 11:08.000
Our runtime is built on top of wazero, the Go WebAssembly runtime,

11:08.000 --> 11:10.000
if anyone's using that.

11:10.000 --> 11:17.000
On the GraphQL piece, we leverage a lot of WunderGraph's Go GraphQL tooling

11:17.000 --> 11:21.000
to generate the schema again based on the types that you've defined

11:21.000 --> 11:26.000
and the signature of your function.

11:26.000 --> 11:31.000
Cool, so let's take a look at what using this looks like.

11:32.000 --> 11:36.000
Is that big enough?

11:36.000 --> 11:37.000
No.

11:37.000 --> 11:38.000
Cool.

11:38.000 --> 11:42.000
So to get started with Modus, we install the Modus CLI from npm.

11:42.000 --> 11:46.000
I've already done this, so that shouldn't really do anything.

11:46.000 --> 11:50.000
And then we're going to say modus new.

11:50.000 --> 11:53.000
This is going to create a new Modus project.

11:53.000 --> 11:56.000
Like I said, we currently support Go and AssemblyScript.

11:56.000 --> 12:01.000
It's very similar to TypeScript.

12:01.000 --> 12:10.000
So we will choose that, and this is going to scaffold out a new Modus project for us,

12:10.000 --> 12:17.000
and pull down the AssemblyScript SDK, just starting off with a very basic

12:17.000 --> 12:22.000
hello world project.

12:22.000 --> 12:29.000
There are a couple of interesting files here; first we're going to see what files we have.

12:29.000 --> 12:34.000
So one of the interesting things to look at is modus.json.

12:34.000 --> 12:36.000
So this is our app manifest.

12:36.000 --> 12:41.000
This is where we configure the endpoints we want to generate.

12:41.000 --> 12:47.000
And then we would also add connections to any data APIs or databases here,

12:47.000 --> 12:52.000
or models that we would want to invoke in our app.

12:52.000 --> 12:59.000
And we can use models hosted by Hypermode in this development environment.

12:59.000 --> 13:03.000
So Hypermode can host models off of Hugging Face.

13:03.000 --> 13:11.000
We also host models like DeepSeek and Llama, or we can connect to a third party model provider like

13:11.000 --> 13:17.000
Anthropic or OpenAI; we just declare that in this modus.json.

13:17.000 --> 13:23.000
And the other file is our index.ts, which just has a very basic function.

13:23.000 --> 13:27.000
A hello world; it takes an optional string and returns a string.
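
Reconstructed from that description, the starter function looks something like this (a sketch; the exact name and wording in the real template may differ, and AssemblyScript is close enough to TypeScript that it reads the same in both):

```typescript
// A hello-world function in the shape described: an exported function
// taking an optional string and returning a string.
export function sayHello(name: string | null = null): string {
  return `Hello, ${name ?? "World"}!`;
}
```

Because Modus inspects exported function signatures, this single typed function is enough to generate a query field in the GraphQL API.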

13:27.000 --> 13:30.000
So let's say modus dev.

13:30.000 --> 13:35.000
And this is going to compile our project.

13:35.000 --> 13:39.000
It'll generate our GraphQL schema and run that locally.

13:39.000 --> 13:44.000
So we'll switch back. This is the Modus API Explorer.

13:44.000 --> 13:49.000
You can think of this as kind of like GraphiQL, if you've used GraphQL before.

13:49.000 --> 13:52.000
And we can see here we have one query field.

13:52.000 --> 13:56.000
sayHello, which maps to that function that we defined.

13:56.000 --> 13:58.000
If we run it, we get hello world.

13:58.000 --> 14:01.000
We can pass in our optional argument.

14:02.000 --> 14:06.000
So that's like a very basic hello world example.

14:06.000 --> 14:08.000
That's not super interesting though.

14:08.000 --> 14:12.000
Let's see what are some more interesting things we can do.

14:12.000 --> 14:17.000
Well, going back to this concept of the model native app.

14:17.000 --> 14:22.000
I think there are like some building blocks here that we really want to expose.

14:22.000 --> 14:26.000
in Modus for building these model native apps.

14:26.000 --> 14:31.000
And this is where models are first class citizen in the application that we're building.

14:31.000 --> 14:33.000
We're not just building a chatbot.

14:33.000 --> 14:40.000
Like, we actually want to leverage our LLM to make decisions about control flow in our application.

14:40.000 --> 14:44.000
And so there's kind of three like building blocks or three steps here.

14:44.000 --> 14:47.000
The first is chaining data and LLMs together.

14:47.000 --> 14:55.000
So this is where, like, our internal enterprise data is fetched and passed in the prompt, in the context of our LLM,

14:55.000 --> 15:00.000
To make that output more relevant.

15:00.000 --> 15:10.000
The second is this idea of function calling or tool use, which we talked about at the beginning, where we're defining functions for the LLM to sort of choose the control flow:

15:10.000 --> 15:15.000
where to fetch data from, what APIs to call, to resolve our prompt.

15:15.000 --> 15:19.000
And the third is this idea of knowledge graph RAG.

15:19.000 --> 15:22.000
RAG is retrieval augmented generation.

15:22.000 --> 15:28.000
Where we're fetching some data and passing that in the context to our LLM.

15:28.000 --> 15:31.000
That's very similar to that first one, chaining data and LLMs together.

15:31.000 --> 15:34.000
That's basically what they call naive RAG.

15:34.000 --> 15:39.000
With knowledge graph RAG, we can leverage the power of the knowledge graph,

15:39.000 --> 15:46.000
so the context of relationships in our data to again provide more relevant context for our LLM.

15:46.000 --> 15:52.000
So let's look at this first example, chaining data and LLMs together.

15:52.000 --> 15:56.000
So we'll take a look at a basic demo app that I've built.

15:56.000 --> 16:04.000
Let's assume that we're running an online blogging website where you type in,

16:04.000 --> 16:09.000
like the text of your blog and then you hit a button and publish it on the internet.

16:09.000 --> 16:11.000
And something that might be useful.

16:11.000 --> 16:15.000
Some AI-backed feature that would be really nice to add to that would be:

16:15.000 --> 16:21.000
one, could we generate the SEO meta tags automatically based on the content?

16:21.000 --> 16:31.000
This is non-trivial because there's a certain way you want to construct the meta description tag to optimize for SEO and these sorts of things.

16:31.000 --> 16:37.000
And then also, could we generate suggested titles based not only on the content of the blog,

16:37.000 --> 16:40.000
but also in the style of the blog author.

16:40.000 --> 16:47.000
So that's going to be based on data that we have in the database that backs this online blog application.

16:47.000 --> 16:52.000
And so this is an approach called naive RAG, or retrieval augmented generation,

16:52.000 --> 16:57.000
where we're fetching some data from our internal database that's relevant for this specific query.

16:57.000 --> 17:05.000
The specific prompt that we're going to send to the LLM and passing that along to improve the results.
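
That retrieval step is simple enough to sketch: look up a row, splice it into the prompt, and hand the combined text to the model. The names below are made up for illustration, and the database lookup is a stand-in for the real SQL query:

```typescript
// Naive RAG in miniature: retrieved data is concatenated into the prompt.
type Author = { name: string; bio: string };

// Stand-in for the SQL lookup (SELECT ... FROM authors WHERE name = ...).
function fetchAuthor(name: string): Author {
  return { name, bio: "Writes about graphs, GraphQL, and travel." };
}

// Build the prompt the model actually sees: instructions plus the
// retrieved bio plus the post content.
function buildTitlePrompt(authorName: string, post: string): string {
  const author = fetchAuthor(authorName);
  return [
    "Suggest blog post titles in the style of this author.",
    `Author bio: ${author.bio}`,
    `Post content: ${post}`,
  ].join("\n");
}
```

The "augmented" part of retrieval augmented generation is exactly that middle line: the model never queries the database itself; the application fetches the data and pastes it into the context.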

17:06.000 --> 17:09.000
So let's stop this guy.

17:11.000 --> 17:13.000
I think this is the right one.

17:13.000 --> 17:17.000
So this is the code; I called this ModusPress.

17:17.000 --> 17:19.000
And there's a couple of functions here.

17:19.000 --> 17:22.000
One is generate SEO.

17:22.000 --> 17:25.000
So you can see the prompt we're passing to the LLM,

17:25.000 --> 17:30.000
you're an SEO expert creating HTML meta tags for this content.

17:30.000 --> 17:34.000
Here's an example of how we want to return it.

17:34.000 --> 17:36.000
And then we paste in here the post content.

17:36.000 --> 17:40.000
And this content comes from a Postgres database.

17:40.000 --> 17:49.000
Somewhere down here we have a query, just fetching the author from a Postgres database that we have running locally.

17:49.000 --> 17:58.000
That has a very small table here that just has one author and one bio.

17:58.000 --> 18:03.000
So let's run this: modus dev.

18:03.000 --> 18:06.000
This is going to compile that.

18:06.000 --> 18:16.000
Oh, and we can also take a look at the modus.json for this project.

18:16.000 --> 18:23.000
And we can see here that the model that we're using is Meta's Llama model, hosted on Hypermode.

18:23.000 --> 18:27.000
This is an open source model provided by Hugging Face.

18:27.000 --> 18:34.000
And then we're also connecting to just a local Postgres instance.

18:34.000 --> 18:39.000
Okay, so back to our API explorer.

18:39.000 --> 18:42.000
So now we have a couple of new endpoints.

18:42.000 --> 18:48.000
We have generate SEO and post content.

18:48.000 --> 18:50.000
So let's start with SEO.

18:50.000 --> 18:52.000
So I wrote a blog post earlier.

18:52.000 --> 18:59.000
Actually, Claude wrote a blog post for me about eating chocolate in Belgium.

18:59.000 --> 19:02.000
So we'll copy that.

19:02.000 --> 19:07.000
And so again imagine, this is the back end piece, the API piece.

19:07.000 --> 19:12.000
But imagine the front end of our app is here's some UI for writing a blog post.

19:12.000 --> 19:17.000
So writing this blog post about eating chocolate in Belgium.

19:17.000 --> 19:23.000
We want to automatically generate these HTML meta tags that are optimized for SEO.

19:23.000 --> 19:28.000
And again, there's a certain way that you write these where it's like verb first or something.

19:28.000 --> 19:32.000
I don't know how SEO works, but here's an example.

19:32.000 --> 19:35.000
Discover the art of Belgian chocolate making blah, blah, blah.

19:35.000 --> 19:39.000
And what's interesting is these LLMs are non-deterministic.

19:39.000 --> 19:44.000
So every time we run this, we get a slightly different response.

19:44.000 --> 19:45.000
Which is kind of fun.

19:45.000 --> 19:48.000
Okay, so that's not a RAG example.

19:48.000 --> 19:51.000
We're not like passing any relevant information there.

19:51.000 --> 19:54.000
So let's look at the example for generating a title.

19:54.000 --> 19:58.000
And remember this one, we want to be in the style of the author.

19:58.000 --> 20:01.000
So that's me.

20:01.000 --> 20:03.000
Travel category.

20:03.000 --> 20:04.000
Let's say travel.

20:04.000 --> 20:07.000
And then here's the blog post again.

20:07.000 --> 20:10.000
And so what this is doing now is it's going to that Postgres database,

20:10.000 --> 20:15.000
Looking up the author, reading their bio and the prompt says,

20:15.000 --> 20:17.000
You know, in the style of the author based on this bio,

20:17.000 --> 20:22.000
If we had, you know, my full post history, we'd pass those along as well.

20:22.000 --> 20:29.000
And here we're getting suggested titles based on our blog content.

20:29.000 --> 20:34.000
And again, non-deterministic, so each time we get a slightly different response.

20:34.000 --> 20:37.000
Which is kind of fun.

20:37.000 --> 20:38.000
Cool.

20:38.000 --> 20:42.000
So that's sort of the basic naive RAG approach.

20:42.000 --> 20:47.000
The next one here is function calling or tool use.

20:47.000 --> 20:51.000
And so this is how we want to implement like this basic,

20:51.000 --> 20:59.000
agentic workflow where the LLM is making some decision about the control flow of our app.

20:59.000 --> 21:02.000
And so we define some functions.

21:02.000 --> 21:04.000
You can look at the code for this.

21:04.000 --> 21:07.000
Let's stop this one.

21:07.000 --> 21:13.000
And here we go, zoom in a bit.

21:13.000 --> 21:19.000
So in this example, we have a very basic warehouse that has some products.

21:19.000 --> 21:26.000
So we have things like shirts, shoes, trousers, hats.

21:26.000 --> 21:28.000
And we have two functions.

21:28.000 --> 21:31.000
We have get product info.

21:31.000 --> 21:34.000
So you can pass a specific product and it will say,

21:34.000 --> 21:37.000
here's how many shirts I have, here's how much they cost.

21:37.000 --> 21:42.000
And then we have get product types that will say we have shirts, trousers, shoes, hats.

21:42.000 --> 21:50.000
And we basically define these as tools that are available to the LLM.

21:50.000 --> 21:54.000
And our LLM calls them based on the query that we have.

21:54.000 --> 21:59.000
And maybe just in the interest of time, I think I have this one in the slides.

21:59.000 --> 22:03.000
Yeah, so we can pass a natural language query.

22:03.000 --> 22:06.000
This one says what's available to buy.

22:06.000 --> 22:09.000
And you can see here the response is the available products to buy.

22:09.000 --> 22:11.000
are shoe, hat, trousers, and shirt.

22:11.000 --> 22:16.000
And we're also logging like the functions that the LLM chose to invoke.

22:16.000 --> 22:23.000
And in this case, it just invoked get product list to return what are the products that you have.

22:23.000 --> 22:27.000
But if we ask something more complicated, this kind of runs off:

22:27.000 --> 22:32.000
How many complete outfits can I buy for $1,600, but I don't like hats.

22:32.000 --> 22:36.000
And so now you can see the different calls the LLM makes.

22:36.000 --> 22:39.000
Well, first it does get product info:

22:39.000 --> 22:41.000
shoe, hat, trousers, shirts.

22:41.000 --> 22:44.000
And then now it's getting the price for each one of those.

22:44.000 --> 22:49.000
And it knows enough not to include a hat in my complete outfit.

22:49.000 --> 22:50.000
So I said I didn't like hats.

22:50.000 --> 22:53.000
So it knows a complete outfit is shoe trousers shirt.

22:53.000 --> 23:00.000
And I can buy two of those because they cost $800 all together.

23:00.000 --> 23:03.000
Very fancy clothing shop, I guess.

23:03.000 --> 23:06.000
Not somewhere I would be shopping.

23:06.000 --> 23:07.000
Okay, cool.

23:07.000 --> 23:11.000
So that's the basic idea of function calling and tool use.

23:11.000 --> 23:14.000
This next piece: knowledge graph RAG.

23:14.000 --> 23:21.000
So this is where we want to leverage a knowledge graph to improve the retriever piece,

23:21.000 --> 23:26.000
to improve the data that we're fetching to pass to the context of the LLM.

23:26.000 --> 23:28.000
So in the first example,

23:28.000 --> 23:32.000
chaining data and LLMs together, the naive RAG approach,

23:32.000 --> 23:40.000
we're just doing a select star from author table where the name is my name.

23:40.000 --> 23:46.000
But we can build much more complicated, much more sophisticated retrievers

23:46.000 --> 23:48.000
to find relevant information.

23:48.000 --> 23:52.000
So for example, we could use embeddings.

23:52.000 --> 23:55.000
If we're dealing with lots of unstructured data,

23:55.000 --> 24:00.000
we can chunk up a bunch of text, calculate embeddings, do vector search.

24:00.000 --> 24:08.000
We can also traverse the knowledge graph to find more relevant pieces of data.

24:08.000 --> 24:11.000
And this is going to depend on what the domain is.

24:11.000 --> 24:16.000
But typically, if we're dealing with unstructured data and constructing a knowledge graph from that,

24:16.000 --> 24:20.000
we'll have our domain graph, which, in this case,

24:20.000 --> 24:22.000
we're going to build a movie recommendation app.

24:22.000 --> 24:26.000
So in our domain graph, in this case, we have movies,

24:26.000 --> 24:30.000
and users that have rated a movie. So this is like a mashup of the MovieLens

24:30.000 --> 24:35.000
dataset and IMDB. MovieLens was used for,

24:35.000 --> 24:41.000
I think, like, the Netflix Prize, for finding movie recommendations.

24:41.000 --> 24:45.000
But anyway, we have a domain graph, but then sometimes we also build,

24:45.000 --> 24:49.000
like, a lexical graph, where we've chunked up unstructured data,

24:49.000 --> 24:52.000
and we're calculating embeddings based on that,

24:52.000 --> 24:56.000
and we can traverse from our lexical graph into our domain graph, and sort of go from there.

24:56.000 --> 24:58.000
There's this graphrag.com site.

24:58.000 --> 25:01.000
It has a lot of information on the different patterns,

25:01.000 --> 25:04.000
different retrievers that are used for GraphRAG.

25:04.000 --> 25:08.000
But with a couple minutes left, let's take a look at this example.

25:08.000 --> 25:13.000
So we're going to use embeddings. Embeddings are basically just taking text,

25:13.000 --> 25:17.000
and converting that to a vector of numbers,

25:17.000 --> 25:20.000
and then we can do what's called vector search,

25:20.000 --> 25:27.000
vector similarity search, to find embedding values that are close together in vector space.
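
The heart of vector similarity search fits in a few lines: cosine similarity between embedding vectors, then rank by it. The three-number "embeddings" below are toy stand-ins (real MiniLM vectors have hundreds of dimensions), and the titles are just for illustration:

```typescript
// Cosine similarity: dot product of two vectors divided by the product of
// their lengths. Close to 1 means the vectors point in similar directions.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Toy "plot embeddings"; a database would index these instead.
const movies: Record<string, number[]> = {
  Cosmos: [0.9, 0.1, 0.0],
  Contact: [0.8, 0.2, 0.1],
  Clueless: [0.0, 0.9, 0.4],
};

// Rank every other movie by similarity to the query vector; take the top hit.
function mostSimilar(query: number[], exclude: string): string {
  return Object.entries(movies)
    .filter(([title]) => title !== exclude)
    .sort((x, y) => cosine(query, y[1]) - cosine(query, x[1]))[0][0];
}
```

A real database replaces the linear scan with an approximate nearest neighbor index, but the similarity metric it ranks by is this same computation.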

25:27.000 --> 25:31.000
So most databases support vector search now.

25:31.000 --> 25:35.000
Let's see if we have this one running, maybe.

25:35.000 --> 25:39.000
So we take a look at the code for this one.

25:55.000 --> 25:59.000
So we're connecting to a Neo4j instance.

25:59.000 --> 26:04.000
We've already calculated these embeddings using the MiniLM model,

26:04.000 --> 26:09.000
which is, again, hosted by Hypermode, provided by Hugging Face.

26:09.000 --> 26:15.000
And then at query time, we're basically just doing vector search to find similar movies.

26:15.000 --> 26:18.000
So let's take a look at the app here.

26:18.000 --> 26:21.000
Is this running?

26:21.000 --> 26:25.000
So we search for a movie by title; let's search for Cosmos,

26:25.000 --> 26:28.000
and we're looking that up in the database,

26:28.000 --> 26:33.000
and then doing vector search to find similar movies based on the embeddings.

26:33.000 --> 26:36.000
And the embedding of, in this case, the plot.

26:36.000 --> 26:42.000
So that's the thing that we embedded in the database.

26:42.000 --> 26:50.000
So the code for this, if you're interested, is available in this modus-recipes

26:50.000 --> 26:51.000
GitHub repo.

26:51.000 --> 26:53.000
This one is specific for the movies kit.

26:53.000 --> 26:57.000
All the examples I showed today are in this modus-recipes repo.

26:57.000 --> 27:02.000
This just shows different ways to use Modus for building cool apps like that.

27:02.000 --> 27:06.000
And that is all I have; you can grab the slides here,

27:06.000 --> 27:08.000
and I think we are out of time as well.

27:08.000 --> 27:11.000
So thanks everyone, and we'll see you around.

27:11.000 --> 27:15.000
Thank you.

