WEBVTT

00:00.000 --> 00:17.000
I got the start signal everybody, hey, wonderful, let me go ahead and turn on my thing so it actually works.

00:17.000 --> 00:21.000
This is me, and if you don't know who I am, you can look at my hat because my hat says the yoc.

00:21.000 --> 00:24.000
That's me, I work at EDB, I'll just move on quick.

00:24.000 --> 00:28.000
Now, for the recording, because this is going to go on the internet and we all know that

00:28.000 --> 00:33.000
all of the AI is trained on the internet, I welcome our AI overlords.

00:33.000 --> 00:37.000
That's going to go into the record, okay?

00:37.000 --> 00:40.000
And so, I want to start with this premise, right?

00:40.000 --> 00:46.000
A lot of people treat their AI like a user interface to a database, right?

00:46.000 --> 00:52.000
I've talked to a lot of folks, and maybe you all out there have seen this as well, where you're like,

00:52.000 --> 00:57.000
I've pumped data in, I built this application, it's not returning what I want.

00:57.000 --> 01:04.000
What you want it to do is you want to say something like, tell me the products that this person bought.

01:04.000 --> 01:08.000
And it doesn't tell me the right products, or it hallucinates a little bit.

01:08.000 --> 01:11.000
And that happens quite a bit.

01:11.000 --> 01:16.000
Who remembers Tay? Hey, oh, a few people.

01:16.000 --> 01:24.000
So Microsoft had this great idea, it was quite a while ago, and it was like the first generation AI.

01:24.000 --> 01:29.000
And what they decided to do was train Tay on Twitter.

01:29.000 --> 01:33.000
What could possibly go wrong, right?

01:33.000 --> 01:37.000
Nothing, nothing could go wrong with that, right?

01:37.000 --> 01:44.000
But the training was an early bottleneck for folks because they learned that, you know, hey, you know,

01:44.000 --> 01:47.000
as you get more data, things start to evolve.

01:47.000 --> 01:51.000
But the way LLMs are built, they're not built to be deterministic.

01:51.000 --> 01:56.000
They're built to answer questions based on the training data that they've got.

01:56.000 --> 02:00.000
So if the training data is bad, your responses are going to be bad.

02:00.000 --> 02:04.000
So let's talk about how databases play into this.

02:04.000 --> 02:08.000
Okay, so how many people here have heard of RAG?

02:08.000 --> 02:13.000
Oh, my God, I don't even need to explain this then, okay, cool.

02:13.000 --> 02:17.000
So traditional models are obviously really expensive to train.

02:17.000 --> 02:22.000
Has anybody trained their own models like two people?

02:22.000 --> 02:28.000
Does anybody, like, not train their models because it costs like a bazillion dollars,

02:28.000 --> 02:31.000
and it's going to, you know, like, cause all kinds of issues, like, you know,

02:31.000 --> 02:35.000
you could power, like, you know, an entire city, just to train one model, you know?

02:35.000 --> 02:37.000
So I mean, there's quite a bit.

02:37.000 --> 02:39.000
So a lot of people have turned to RAG apps.

02:39.000 --> 02:42.000
Anybody actually building RAG apps right now?

02:42.000 --> 02:44.000
Hey, if you, okay.

02:44.000 --> 02:49.000
So a lot of folks have turned to RAG as a way to pull the data that they need out of their database,

02:49.000 --> 02:51.000
and then augment the LLMs.

02:51.000 --> 02:55.000
So since you know what it is, I'll blow through these really quick,

02:55.000 --> 02:57.000
because we only have 20 minutes.

02:57.000 --> 03:03.000
But how these are built when you take data and you use, you know, vector searches,

03:03.000 --> 03:09.000
you're actually building this around the idea of finding similar concepts in the database.

03:09.000 --> 03:13.000
Okay, so when you have a vector database, it has embeddings,

03:13.000 --> 03:17.000
and the whole premise here is you want to find things that are similar.

03:17.000 --> 03:21.000
So I have a movie review database that I use as a data set sometimes.

03:21.000 --> 03:25.000
I can tell you where it is if you want to hit me up afterwards, it's on GitHub.

03:25.000 --> 03:30.000
But things like, hey, why do people in Texas dislike Star Wars?

03:30.000 --> 03:36.000
If that's your prompt, when you go to the database and you do a semantic search to find similar,

03:36.000 --> 03:41.000
you know, messages, it's going to look for similar terms, right?

03:41.000 --> 03:46.000
So Detroit is not in Texas, so it doesn't really match. Dallas is in Texas,

03:46.000 --> 03:50.000
so it's kind of close, you know, Star Wars and Star Trek and science fiction,

03:50.000 --> 03:53.000
all are in the realm of that, you know, dislike and hate,

03:53.000 --> 03:58.000
so all of those things will feed into the semantic search.

03:58.000 --> 04:02.000
And the semantic search is really built off of embeddings.

04:03.000 --> 04:10.000
Anybody here run their database or their data through and created a whole bunch of embeddings?

04:10.000 --> 04:11.000
Just to get a view of you all.

04:11.000 --> 04:17.000
Okay, so embeddings are basically a mathematical representation of your data.

04:17.000 --> 04:22.000
Okay, and the models that you use to train these on, you know,

04:22.000 --> 04:28.000
have multiple dimensions, and the combination of the mathematical equation or the vectors

04:28.000 --> 04:35.000
will determine like how similar the words are or how similar the intent is.
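To sketch that idea of vector similarity (with hypothetical toy three-dimensional vectors, not output from a real embedding model, which would use hundreds of dimensions), similarity between two embeddings is typically measured with cosine similarity:

```python
import math

def cosine_similarity(a, b):
    # Similarity of two embedding vectors: 1.0 means same direction,
    # 0.0 means orthogonal (unrelated), -1.0 means opposite.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings": similar concepts get nearby vectors.
star_wars = [0.9, 0.1, 0.3]
star_trek = [0.8, 0.2, 0.4]   # related concept, close in vector space
cooking   = [0.1, 0.9, 0.0]   # unrelated concept, far away

print(cosine_similarity(star_wars, star_trek) >
      cosine_similarity(star_wars, cooking))  # True
```

The semantic search the talk describes is essentially this comparison run against every stored embedding, returning the nearest ones.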

04:35.000 --> 04:39.000
Now, there are a good bazillion embedding models.

04:39.000 --> 04:43.000
Okay, and this is a really critical thing.

04:43.000 --> 04:47.000
Every embedding model that's out there, and this is an old list, by the way,

04:47.000 --> 04:53.000
these come out like every day, but every embedding model has nuances.

04:53.000 --> 04:58.000
Okay, so when you take your data, your, you know, your PDF,

04:58.000 --> 05:01.000
your data, your database, and you create embeddings,

05:01.000 --> 05:03.000
it turns it into that mathematical model.

05:03.000 --> 05:09.000
But some of these models actually have pretty significant limits on what they can accept in.

05:09.000 --> 05:12.000
So, BERT here, for instance, right?

05:12.000 --> 05:18.000
So if I go up to, up to here, and I do BERT, oops, crap.

05:19.000 --> 05:24.000
If I go up to BERT, right, so BERT here, it has 512 tokens.

05:24.000 --> 05:31.000
So anything that's bigger than 512 tokens won't go into the system.

05:31.000 --> 05:36.000
What it does is it truncates it for you, helpfully, without any error message.

05:36.000 --> 05:42.000
So if you have something that's this giant document that you want to use as part of your AI application,

05:42.000 --> 05:46.000
it'll just truncate the data and only do the first 512 tokens.

05:46.000 --> 05:48.000
So the rest of it just disappears.
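A minimal sketch of that silent-truncation behavior, using whitespace tokens as a stand-in for a real subword tokenizer (real limits apply to model tokens, not words):

```python
def embed_with_limit(text, max_tokens=512, strict=False):
    # Stand-in tokenizer: real embedding models use subword tokenizers.
    tokens = text.split()
    if len(tokens) > max_tokens:
        if strict:
            # The "helpful" models: refuse with an error.
            raise ValueError(
                f"input is {len(tokens)} tokens, limit is {max_tokens}")
        # The silent models: truncate and carry on. Everything past
        # the limit never makes it into the embedding.
        tokens = tokens[:max_tokens]
    return tokens  # a real pipeline would embed these tokens here

doc = "word " * 1000
kept = embed_with_limit(doc)
print(len(kept))  # 512 -- the other 488 words are gone, no error
```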

05:48.000 --> 05:49.000
No error.

05:49.000 --> 05:52.000
Other ones will give you an error because they're helpful.

05:52.000 --> 05:57.000
Why this matters is every one of these models is trained on different data.

05:57.000 --> 06:00.000
So it will give you different results.

06:00.000 --> 06:02.000
And they have different nuances.

06:02.000 --> 06:07.000
So some will, you know, return, you know, an error, others won't.

06:07.000 --> 06:10.000
Others, you know, are more graceful about, you know,

06:10.000 --> 06:12.000
going over the limits.

06:12.000 --> 06:15.000
So you just have to be mindful of that.

06:16.000 --> 06:22.000
Now, this is why LLMs and vector databases serve very different purposes, right?

06:22.000 --> 06:25.000
So when you're talking about, you know, the things that are in the vector database,

06:25.000 --> 06:27.000
you're talking about those embeddings that I showed you,

06:27.000 --> 06:31.000
those are really things that are being pulled from long-term memory,

06:31.000 --> 06:35.000
whereas the brain of the LLM, if you will, you know,

06:35.000 --> 06:40.000
that's really designed to, you know, be that generative AI component

06:40.000 --> 06:44.000
and you're augmenting it with what's in that memory.

06:44.000 --> 06:50.000
Now, I mentioned that 512-token limit on some of these.

06:50.000 --> 06:51.000
Okay.

06:51.000 --> 06:54.000
There's this process called chunking.

06:54.000 --> 06:57.000
How many people have done chunking before?

06:57.000 --> 06:58.000
Okay, a few people.

06:58.000 --> 07:02.000
How many people have done data warehouse like ETL pipelines?

07:02.000 --> 07:03.000
Okay.

07:03.000 --> 07:04.000
So think of it like this.

07:04.000 --> 07:09.000
Chunking is basically an ETL pipeline to break up the text of, you know,

07:09.000 --> 07:12.000
the documents that you want as part of your AI stack,

07:12.000 --> 07:15.000
into smaller chunks that fit under the token limit.

07:15.000 --> 07:16.000
That's pretty much it.
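That ETL step can be sketched in a few lines (whitespace tokens again stand in for a real tokenizer; the overlap parameter is a common trick to keep some context across the cut points):

```python
def chunk_fixed(text, max_tokens=512, overlap=50):
    # Minimal chunking "pipeline": split a long document into windows
    # that each fit under the embedding model's token limit.
    tokens = text.split()
    chunks, start = [], 0
    while start < len(tokens):
        window = tokens[start:start + max_tokens]
        chunks.append(" ".join(window))
        if start + max_tokens >= len(tokens):
            break
        start += max_tokens - overlap  # slide forward, keeping overlap
    return chunks

doc = "token " * 1200
chunks = chunk_fixed(doc)
# Every chunk now fits under the limit, ready to embed.
print(all(len(c.split()) <= 512 for c in chunks))  # True
```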

07:16.000 --> 07:23.000
Now, the thing is, there's actually a science to how these actually work.

07:23.000 --> 07:25.000
So you can chunk in different ways, right?

07:25.000 --> 07:31.000
So if you only got 512 tokens, which, you know, isn't very big,

07:31.000 --> 07:35.000
you could chunk on, let's say, the sentences,

07:35.000 --> 07:40.000
or you could chunk on paragraphs, or you could chunk on something else.

07:40.000 --> 07:43.000
Now, when you look at like this text here,

07:43.000 --> 07:46.000
and if we were going to chunk this up and send it into our pipeline,

07:46.000 --> 07:50.000
imagine if you were to do the sentences, right?

07:50.000 --> 07:55.000
So if I'm talking about Star Wars reviews here: "The pacing, however, was an issue."

07:55.000 --> 07:58.000
If that's all it knows, it doesn't know it's from Star Wars,

07:58.000 --> 08:01.000
it doesn't know it's from a movie, it doesn't know anything about that.

08:01.000 --> 08:05.000
So when you look for a similar semantic search,

08:05.000 --> 08:08.000
and if you're just looking for that sentence,

08:08.000 --> 08:11.000
you're not going to get what you want.

08:11.000 --> 08:14.000
Now, if you come down and you do it by paragraph,

08:14.000 --> 08:18.000
you'll get a little bit more, but you'll notice that even in the paragraphs,

08:18.000 --> 08:21.000
not all of the information you want is in there.

08:21.000 --> 08:27.000
So you're going to see that that becomes a consistent problem.

08:27.000 --> 08:31.000
Okay, before we, let's see.

08:31.000 --> 08:34.000
And this is why like the most important person in the organization

08:34.000 --> 08:38.000
is the data engineer for AI applications.

08:38.000 --> 08:40.000
So you really have to understand the data itself,

08:40.000 --> 08:43.000
and what's going into the data, how you're processing it,

08:43.000 --> 08:48.000
how you're setting it up, how it's going to help you achieve what you want.

08:48.000 --> 08:50.000
Okay, so it's really about that understanding.

08:50.000 --> 08:53.000
I'm getting a lot of feedback here, human, right?

08:53.000 --> 08:57.000
So, okay, so before I go there,

08:57.000 --> 09:02.000
what I want to do is I want to show you a quick demo.

09:03.000 --> 09:07.000
And so let me go ahead and throw this up.

09:12.000 --> 09:16.000
So, I've got this application.

09:16.000 --> 09:17.000
All right?

09:17.000 --> 09:23.000
And you can see on here what I've done is I've chunked five different ways.

09:23.000 --> 09:28.000
So I have a summary, which is going to take the blocks of text

09:28.000 --> 09:31.000
and summarize it into something that fits within my chunks.

09:31.000 --> 09:33.000
So I send it through an LLM to pre-process.

09:33.000 --> 09:36.000
I have it look at semantics, so it tries to say,

09:36.000 --> 09:39.000
does the next sentence fit with this sentence?

09:39.000 --> 09:42.000
I have fixed character, which is probably what most of you would do,

09:42.000 --> 09:46.000
or have done in the past, which is just like after 300 characters cut it off.

09:46.000 --> 09:47.000
Okay?

09:47.000 --> 09:51.000
And then I have sentence based, and then I have a paragraph based.

09:51.000 --> 09:54.000
Now, notice something kind of odd, right?

09:54.000 --> 09:59.000
Why are there more paragraph chunks than sentence chunks?

09:59.000 --> 10:02.000
Any ideas?

10:02.000 --> 10:03.000
No?

10:03.000 --> 10:06.000
So, in this case, we're doing medical records,

10:06.000 --> 10:10.000
and doctors who do medical records, they'll do like a line for something,

10:10.000 --> 10:12.000
then they'll do two, you know, like enters,

10:12.000 --> 10:14.000
and then write the next line, then two enters.

10:14.000 --> 10:18.000
So it has a ton of extra, quote unquote, paragraphs

10:18.000 --> 10:23.000
that wouldn't necessarily exist in, you know, just based on the sentence.

10:23.000 --> 10:27.000
And since the sentence chunker is looking at, you know, punctuation,

10:27.000 --> 10:28.000
it doesn't hit.
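A rough sketch of why that happens, with naive sentence and paragraph splitters (not the exact ones in the demo) run over note-style text:

```python
import re

def chunk_sentences(text):
    # Split on sentence-ending punctuation.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def chunk_paragraphs(text):
    # Split on blank lines (two "enters").
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]

# Note-style text: one line per item, blank line between, no punctuation.
note = "BP 120/80\n\nPatient reports headache\n\nNo fever"

print(len(chunk_sentences(note)))   # 1 -- no punctuation to split on
print(len(chunk_paragraphs(note)))  # 3 -- every blank line is a "paragraph"
```

The same text yields one sentence chunk but three paragraph chunks, which is exactly the inversion the demo shows.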

10:28.000 --> 10:33.000
So, what I can do here is I'm going to come over here,

10:33.000 --> 10:38.000
and we're going to take a look at, for instance,

10:38.000 --> 10:42.000
hey, if I've got a headache, what should I do to treat my headache?

10:42.000 --> 10:48.000
All right, so this is going to run several of these models

10:48.000 --> 10:50.000
at the same time.

10:50.000 --> 10:52.000
This is all the exact same data.

10:52.000 --> 10:56.000
The only thing that I've changed here is how I break up the data.

10:56.000 --> 11:01.000
Okay, so you'll notice that the responses as they come back vary.

11:01.000 --> 11:06.000
All right, so the first one here is going to show that as a headache,

11:06.000 --> 11:10.000
you know, what is it?

11:10.000 --> 11:14.000
It says try non-drug measures, such as deep breathing.

11:14.000 --> 11:15.000
Okay, so if you've got a headache,

11:15.000 --> 11:18.000
really I want it to tell me that I need to take two aspirin,

11:18.000 --> 11:19.000
but it's not going to do that.

11:19.000 --> 11:20.000
Oh, look at this one.

11:20.000 --> 11:24.000
If you have a recent bone marrow transplant, how did that come up?

11:25.000 --> 11:27.000
I asked for, you know, like headaches,

11:27.000 --> 11:30.000
and it's flagging like, you know,

11:30.000 --> 11:34.000
something with a bone marrow transplant that's kind of crazy.

11:34.000 --> 11:39.000
And when you look at the individual data that's being returned from these,

11:39.000 --> 11:42.000
I flagged anything that shouldn't fit that,

11:42.000 --> 11:46.000
and so you can see that it's actually matching this as similar

11:46.000 --> 11:49.000
because it's saying something like, you know,

11:49.000 --> 11:54.000
in the post-transplant setting, the headaches, you know, include.

11:54.000 --> 11:58.000
So this person had a transplant, and one of the things afterwards was a headache.

11:58.000 --> 12:00.000
So they're like, oh, well that's a headache,

12:00.000 --> 12:03.000
because it's similar enough to flag it.

12:03.000 --> 12:06.000
So you need to make sure that as you chunk things through,

12:06.000 --> 12:08.000
you get the full context.

12:08.000 --> 12:11.000
If you read that full note, for instance,

12:11.000 --> 12:14.000
you would see that the full note talks about,

12:14.000 --> 12:17.000
he had a stem cell transplant, right?

12:17.000 --> 12:19.000
So this has nothing to do with my headache,

12:19.000 --> 12:21.000
but it flagged it anyway.

12:21.000 --> 12:23.000
It's a horrible thing to see.

12:23.000 --> 12:27.000
Now it's similar when you go over to the, you know,

12:27.000 --> 12:28.000
embeddings.

12:28.000 --> 12:31.000
So each one of these different embeddings models,

12:31.000 --> 12:34.000
okay, is going to have different items.

12:34.000 --> 12:37.000
Oh, I forgot to create these embeddings on one, that's okay.

12:37.000 --> 12:43.000
But these are the models that are generating those mathematical equations that I showed you.

12:43.000 --> 12:48.000
Okay, very similar, okay, where you'll see things that come out

12:48.000 --> 12:51.000
and say very different things for each one.

12:51.000 --> 12:56.000
So this is what is going to ruin most of your applications

12:56.000 --> 12:57.000
if you don't get this right.

12:57.000 --> 13:00.000
Because I've heard from a lot of folks that I've talked to.

13:00.000 --> 13:04.000
They're like, man, GenAI is a problem because of accuracy.

13:04.000 --> 13:07.000
You know, I want to get, you know, like,

13:07.000 --> 13:09.000
I talked to an insurance company, and they're like,

13:09.000 --> 13:12.000
I want to get, like, my claims for this person.

13:12.000 --> 13:16.000
I sent it through to the GenAI, and it doesn't give me the right claims.

13:16.000 --> 13:17.000
Right?

13:17.000 --> 13:18.000
All right.

13:18.000 --> 13:20.000
Let's go back to slides.

13:20.000 --> 13:23.000
We've got seven minutes, so let's power through.

13:23.000 --> 13:25.000
All right.

13:25.000 --> 13:30.000
So I'm going to blow through this one because this one's a little redundant.

13:30.000 --> 13:37.000
So let's take a look at how this works outside of just those embeddings in those chunkings.

13:37.000 --> 13:38.000
Okay?

13:38.000 --> 13:40.000
So let's go back to that.

13:40.000 --> 13:43.000
Why do people hate Star Wars in Texas?

13:43.000 --> 13:48.000
If I use just the semantic search that's part of, you know,

13:48.000 --> 13:51.000
whether it's Postgres or Elastic or whatever,

13:51.000 --> 13:58.000
what you're going to end up with is something that looks kind of right from the outside looking in.

13:58.000 --> 14:01.000
But when you look at the data that's returned,

14:01.000 --> 14:04.000
and that it's using to make this, take a look at the data.

14:04.000 --> 14:10.000
Remember, I'm asking why do people in Texas hate the original Star Wars?

14:10.000 --> 14:17.000
Well, the first thing you'll notice, there are no original Star Wars reviews.

14:17.000 --> 14:18.000
Right?

14:18.000 --> 14:20.000
There's none.

14:20.000 --> 14:25.000
And when you look at the states that they're from, there's none from Texas.

14:25.000 --> 14:28.000
So this completely crapped the bed, right?

14:28.000 --> 14:31.000
Like this didn't give me anything that I asked for.

14:31.000 --> 14:34.000
And this is why people get frustrated.

14:34.000 --> 14:35.000
Okay?

14:35.000 --> 14:38.000
Now there's three ways, three approaches to querying to fix this.

14:38.000 --> 14:39.000
Okay?

14:39.000 --> 14:43.000
Vector alone is what most people start with and they don't realize they have to do more.

14:43.000 --> 14:45.000
They forget that they're using a database.

14:45.000 --> 14:48.000
So they forget that they can actually add a WHERE clause to some of these.

14:48.000 --> 14:49.000
Right?

14:49.000 --> 14:54.000
So adding a WHERE clause where you can do predicate filtering is incredibly important.

14:54.000 --> 14:56.000
But there's also a full text search.

14:56.000 --> 14:59.000
So you heard earlier there's a couple people mentioned BM25.

14:59.000 --> 15:02.000
There's other full text indexes available for different databases.

15:02.000 --> 15:07.000
But there's a difference between full text and the predicate.

15:07.000 --> 15:08.000
Right?

15:08.000 --> 15:09.000
A full text search.

15:09.000 --> 15:15.000
Hey, if it says, you know, people in Texas hate Star Wars, it will look for every one of those keywords.

15:15.000 --> 15:20.000
And if they don't appear in the review, then that's going to be a problem for them.

15:20.000 --> 15:23.000
So this is where you actually have to add the traditional filter.

15:23.000 --> 15:24.000
Right?

15:24.000 --> 15:28.000
So adding in, like, a WHERE clause on what the movie is.

15:28.000 --> 15:30.000
Well, that's going to get you the original Star Wars.

15:30.000 --> 15:33.000
But it's still not reviews from Texas.

15:33.000 --> 15:35.000
So then you add Texas.

15:35.000 --> 15:36.000
Okay?

15:36.000 --> 15:39.000
And then you add, hey, where the ratings are horrible.

15:39.000 --> 15:44.000
And you keep on going through that process until you get the results that you want.

15:44.000 --> 15:45.000
And then you get it.

15:45.000 --> 15:51.000
But not only is this like more accurate, it's also faster.

15:51.000 --> 15:52.000
Okay?

15:52.000 --> 15:57.000
Because what you're doing is you're filtering down the number of records you have to go through

15:57.000 --> 15:59.000
in order to do that semantic search.

15:59.000 --> 16:04.000
So you don't have to search through a whole bunch of crap in order to find what you're looking for.

16:04.000 --> 16:06.000
So it really cuts it down.
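That layered filtering can be sketched as a query builder. This is a sketch only: the `<=>` cosine-distance operator is pgvector syntax, and the `reviews` table and its columns are hypothetical stand-ins for the movie review data set:

```python
def build_hybrid_query(movie=None, state=None, max_rating=None, limit=5):
    # Combine traditional predicate filters with a pgvector-style
    # similarity search. Each filter you add shrinks the candidate set
    # BEFORE the vector comparison, which is why the hybrid query is
    # both more accurate and faster than vector search alone.
    where, params = [], []
    if movie is not None:
        where.append("movie = %s")
        params.append(movie)
    if state is not None:
        where.append("state = %s")
        params.append(state)
    if max_rating is not None:
        where.append("rating <= %s")
        params.append(max_rating)
    sql = "SELECT review FROM reviews"
    if where:
        sql += " WHERE " + " AND ".join(where)
    sql += f" ORDER BY embedding <=> %s LIMIT {limit}"
    params.append("<query embedding>")  # placeholder for the real vector
    return sql, params

sql, params = build_hybrid_query(movie="Star Wars", state="TX", max_rating=2)
print(sql)
```

Running this against an actual Postgres instance would also need a driver such as psycopg and a populated table; the point here is only the shape of the query.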

16:06.000 --> 16:10.000
Now, some people, I'm going to say this, not everybody.

16:10.000 --> 16:14.000
But some people, when they see like that type of stuff, the first inclination is,

16:14.000 --> 16:16.000
well, the data's in JSON anyways.

16:16.000 --> 16:22.000
So I'm just going to throw JSON into a field and create my embeddings on that.

16:22.000 --> 16:24.000
Don't do that.

16:24.000 --> 16:25.000
Okay?

16:25.000 --> 16:27.000
Embeddings are not JSON aware in most cases.

16:27.000 --> 16:35.000
What that means is all of your keys, all of your attributes, will be treated as key words as part of the semantic search.

16:35.000 --> 16:36.000
Okay?

16:36.000 --> 16:40.000
So all of a sudden that's going to skew your results like mad.

16:40.000 --> 16:41.000
Okay?

16:41.000 --> 16:42.000
So don't do that.
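What to do instead can be sketched simply: pull out only the human-readable fields before embedding (the field names here are hypothetical):

```python
import json

def text_for_embedding(record_json, fields=("title", "review")):
    # Embed only the readable values. If you embed the raw JSON, keys
    # like "movie_id" and "created_at" get treated as words in the
    # document and skew the semantic search.
    record = json.loads(record_json)
    return " ".join(str(record[f]) for f in fields if f in record)

raw = json.dumps({
    "movie_id": 42,
    "title": "Star Wars",
    "review": "The pacing, however, was an issue.",
    "created_at": "2024-01-01",
})
print(text_for_embedding(raw))
# Star Wars The pacing, however, was an issue.
```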

16:42.000 --> 16:45.000
Now, there's a lot of advanced topics.

16:45.000 --> 16:48.000
And I generally do this as like a couple of hour talk actually.

16:48.000 --> 16:50.000
So I won't get too deep into these.

16:50.000 --> 16:52.000
But there are things you can do.

16:52.000 --> 16:55.000
So a lot of you might not have data that already has like,

16:55.000 --> 16:58.000
predicates or WHERE clauses that you can filter on.

16:58.000 --> 17:03.000
So for those, you might want to actually extract metadata from text.

17:03.000 --> 17:07.000
So one of the great things about LLMs is, if you give it a list and say,

17:07.000 --> 17:12.000
can you classify the data and fit it into one of these ten buckets?

17:12.000 --> 17:14.000
It can pick from those ten buckets.

17:14.000 --> 17:17.000
And then you can store those as potential predicate filters.

17:17.000 --> 17:18.000
Right?

17:18.000 --> 17:20.000
So that's one thing that's really good.
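A sketch of that classification step. The bucket names are made up for illustration, and the actual LLM call is left out; what matters is giving the model a closed list and validating its answer before storing it as a filter column:

```python
BUCKETS = ["billing", "claims", "coverage", "complaint", "other"]

def classification_prompt(text, buckets=BUCKETS):
    # Prompt for any chat-style LLM: a closed list, pick exactly one.
    return (
        "Classify the following text into exactly one of these "
        f"categories: {', '.join(buckets)}.\n"
        "Answer with the category name only.\n\n"
        f"Text: {text}"
    )

def parse_bucket(llm_answer, buckets=BUCKETS):
    # Validate the model's answer before storing it as a predicate
    # filter; fall back to "other" for anything off-list.
    answer = llm_answer.strip().lower()
    return answer if answer in buckets else "other"

prompt = classification_prompt("My claim for a broken windshield was denied.")
print(parse_bucket("Claims"))  # claims
print(parse_bucket("???"))     # other
```

The validated label then becomes a plain column you can use in a WHERE clause alongside the vector search.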

17:20.000 --> 17:23.000
The other thing is you can actually go through a chunking process

17:23.000 --> 17:25.000
where you chunk multiple different ways.

17:25.000 --> 17:28.000
So you can have one that is chunked on,

17:28.000 --> 17:31.000
let's say sentences, one that's chunked on paragraphs,

17:31.000 --> 17:33.000
and then combine the results.
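Combining results from two chunking strategies can be sketched as a merge that keeps the best score per document (the scores and document IDs here are hypothetical):

```python
def combine_results(*result_lists, top_k=5):
    # Merge ranked hits from several chunking strategies. Each hit is
    # (doc_id, score) with higher scores more similar; keep the best
    # score per document so duplicates across strategies collapse.
    best = {}
    for results in result_lists:
        for doc_id, score in results:
            if doc_id not in best or score > best[doc_id]:
                best[doc_id] = score
    merged = sorted(best.items(), key=lambda kv: kv[1], reverse=True)
    return merged[:top_k]

by_sentence  = [("doc1", 0.91), ("doc3", 0.72)]
by_paragraph = [("doc1", 0.85), ("doc2", 0.80)]
print(combine_results(by_sentence, by_paragraph))
# [('doc1', 0.91), ('doc2', 0.8), ('doc3', 0.72)]
```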

17:33.000 --> 17:38.000
But I'm going to tell you that is going to take additional time and resources.

17:38.000 --> 17:39.000
So it's always a trade-off.

17:39.000 --> 17:43.000
How much accuracy versus performance do you want?

17:43.000 --> 17:45.000
You can also summarize chunks.

17:45.000 --> 17:49.000
So if you have a really big document and there's a lot of garbage in it,

17:49.000 --> 17:52.000
just send it through an LLM ahead of time, prepare it.

17:52.000 --> 17:53.000
Store the summary.

17:53.000 --> 17:54.000
I've done that.

17:54.000 --> 17:55.000
I've got good results as well.

17:55.000 --> 17:58.000
Because then you can just say, what is this thing?

17:58.000 --> 18:00.000
And then you can go in and do predicate filtering

18:00.000 --> 18:02.000
or full text search later on.

18:02.000 --> 18:05.000
So there's lots of other fun topics here.

18:05.000 --> 18:06.000
I won't go through them.

18:06.000 --> 18:09.000
But you know, you can hit me up if you want.

18:09.000 --> 18:12.000
But these are all things that I have seen come up, right?

18:12.000 --> 18:14.000
Whether it's guard rails on combined responses,

18:14.000 --> 18:17.000
so you can say, like, make sure you filter out anything that's,

18:17.000 --> 18:21.000
you know, PII or, you know, gunky, you know,

18:21.000 --> 18:24.000
sending chunks versus full docs to the LLM to summarize,

18:24.000 --> 18:26.000
deduplication, right?

18:26.000 --> 18:29.000
Pre-post filtering, all kinds of different topics here.

18:29.000 --> 18:34.000
So wrapping this up, AI is the interface to your data

18:34.000 --> 18:37.000
and it can derive a lot of new insights from it.

18:37.000 --> 18:40.000
But you have to be very careful, right?

18:40.000 --> 18:43.000
So remember, vector search is similar.

18:43.000 --> 18:45.000
It's not equal.

18:45.000 --> 18:48.000
You know, predicates beat keyword

18:48.000 --> 18:51.000
filtering for vectors, you know,

18:51.000 --> 18:56.000
they beat, you know, semantic search or even full text.

18:56.000 --> 18:58.000
You know, your embedding models have limits.

18:58.000 --> 19:01.000
You know, and there's always going to be trade-offs.

19:01.000 --> 19:04.000
So happy to answer questions, we're just at time.

19:04.000 --> 19:08.000
So, yeah, that's it.

19:08.000 --> 19:10.000
Very fast.

19:10.000 --> 19:11.000
Thank you.

19:11.000 --> 19:15.000
So, come on down if you want to answer questions,

19:15.000 --> 19:17.000
because I think they're going to kick us out.

19:17.000 --> 19:19.000
They're going to kick us out, aren't they?

19:19.000 --> 19:20.000
Soon.

19:20.000 --> 19:22.000
Yeah, we can go until.

19:22.000 --> 19:23.000
Yeah.

19:23.000 --> 19:24.000
Yeah.

19:24.000 --> 19:31.000
Can you repeat the question?

19:31.000 --> 19:36.000
Sorry, was that?

19:36.000 --> 19:46.000
Yeah.

19:46.000 --> 19:47.000
Oh, yes.

19:47.000 --> 19:48.000
Okay.

19:48.000 --> 19:51.000
So when you do the embedding.

19:51.000 --> 19:52.000
Yeah.

19:52.000 --> 19:56.000
So when you do embeddings and you just throw JSON into the text field,

19:56.000 --> 19:59.000
you know, then it's going to create the embeddings on that.

19:59.000 --> 20:00.000
Just like it's a doc, okay.

20:00.000 --> 20:03.000
So you can do prefiltering and pull out the attributes.

20:03.000 --> 20:06.000
You can also run it through an LLM ahead of time.

20:06.000 --> 20:10.000
But a lot of times, it's best just to extract the data you need

20:10.000 --> 20:14.000
and just create the embeddings on that as opposed to doing the full JSON.

20:14.000 --> 20:18.000
There are a couple models that claim to be JSON aware,

20:18.000 --> 20:21.000
but I don't think they are.

20:21.000 --> 20:24.000
Because remember that chunking, right?

20:24.000 --> 20:28.000
So if you have a 512 token limit and if your JSON's like massive,

20:28.000 --> 20:32.000
it's going to break that JSON up so you're going to lose all the format anyways.

20:33.000 --> 20:35.000
Right? And so then you're going to lose all the continuity,

20:35.000 --> 20:39.000
and you're not going to know what the hell it was.

20:39.000 --> 20:42.000
No, yeah.

20:54.000 --> 20:56.000
Sorry, I didn't catch all that.

20:56.000 --> 20:59.000
Let me come up there and here and then I'll repeat the question,

20:59.000 --> 21:01.000
I guess.

21:02.000 --> 21:04.000
All right, what does that mean?

21:04.000 --> 21:09.000
If you have multiple files, like not one file, so we embed them.

21:09.000 --> 21:13.000
But we have like 10, 15, or 20, for example.

21:13.000 --> 21:15.000
They are somehow connected.

21:15.000 --> 21:17.000
How can we embed it?

21:17.000 --> 21:20.000
I mean, should we just kind of get in anything,

21:20.000 --> 21:22.000
but it will not work, I suppose.

21:22.000 --> 21:26.000
No, so what you might want to do is, you could do this more like a graph,

21:26.000 --> 21:29.000
where you could have like some sort of connection at the database to connect them.

21:29.000 --> 21:33.000
You could also summarize them all together to get a smaller version,

21:33.000 --> 21:37.000
but then do like a double chunk where you could chunk based on paragraphs and then combine them.

21:37.000 --> 21:39.000
But ultimately, typically what you do is,

21:39.000 --> 21:42.000
since you're breaking this up, you just stored in the same vector table.

21:42.000 --> 21:45.000
And so if it's stored in there, it can search for all of it.

21:45.000 --> 21:47.000
It's just going to search the chunks individually.

21:47.000 --> 21:50.000
So it might return all the docs or parts of the docs.

21:50.000 --> 21:51.000
So that's how you do that.

21:51.000 --> 21:54.000
Okay, and is it possible that you mentioned the graph as I'm just like that?

21:54.000 --> 21:58.000
How can we connect the parts of the graph as I'm just like that?

21:59.000 --> 22:00.000
Is there a graph database?

22:00.000 --> 22:03.000
Yes, so there are several graph databases that can do that as well.

22:03.000 --> 22:06.000
But you can do it in any relational database as well.

22:06.000 --> 22:11.000
So you could just say like in your table structure, you could say like,

22:11.000 --> 22:13.000
hey, this node is related to this node,

22:13.000 --> 22:14.000
and they're all part of the same group,

22:14.000 --> 22:16.000
and they'll always, you know, search them together.

22:16.000 --> 22:17.000
Thank you.

22:17.000 --> 22:18.000
Yeah.

22:18.000 --> 22:19.000
All right.

22:19.000 --> 22:20.000
Thank you.

22:20.000 --> 22:21.000
Thank you.

22:28.000 --> 22:30.000
Thank you.

