WEBVTT

00:00.000 --> 00:09.600
...how databases can play a role in enhancing AI.

00:09.600 --> 00:12.000
Please welcome Bruce.

00:12.000 --> 00:13.000
Great.

00:13.000 --> 00:15.000
Thank you everybody.

00:15.000 --> 00:20.000
I am the closer today for this room.

00:20.000 --> 00:21.000
I'm excited.

00:21.000 --> 00:25.000
I was here two years ago talking about home automation.

00:25.000 --> 00:27.000
Anyone in the session?

00:27.000 --> 00:28.000
Two years ago?

00:28.000 --> 00:29.000
Well, okay.

00:29.000 --> 00:32.000
Well, thank you very much.

00:32.000 --> 00:34.000
So this talk is completely different.

00:34.000 --> 00:36.000
I love home automation, but not today.

00:36.000 --> 00:40.000
We're going to be talking about AI.

00:40.000 --> 00:41.000
My name is Bruce Momjian.

00:41.000 --> 00:42.000
I live in Philadelphia.

00:42.000 --> 00:43.000
I work for EDB.

00:43.000 --> 00:46.000
I'm on a five- or six-city trip.

00:46.000 --> 00:48.000
I spoke in Prague earlier in the week.

00:48.000 --> 00:51.000
I'm going to Berlin on Tuesday to speak.

00:51.000 --> 00:55.000
I'll also be speaking in Riga and Amsterdam the week after.

00:55.000 --> 00:58.000
A lot of fun and then 10 days, 12 days after that.

00:58.000 --> 01:02.000
I'll be in Montreal and I'm doing a three week trip in India.

01:02.000 --> 01:04.000
In March and then maybe go to Hong Kong.

01:04.000 --> 01:07.000
So we'll see a lot of excitement.

01:07.000 --> 01:10.000
But this talk, I will warn you,

01:10.000 --> 01:12.000
can stress your brain.

01:12.000 --> 01:13.000
So just be aware.

01:13.000 --> 01:15.000
This is pretty complicated.

01:15.000 --> 01:17.000
I'll explain why it's complicated in a minute.

01:17.000 --> 01:19.000
But this is going to be.

01:19.000 --> 01:22.000
It might be a challenge, but I'm hoping you'll get a lot out of it.

01:22.000 --> 01:24.000
I certainly got a lot out of it writing it.

01:24.000 --> 01:25.000
All right.

01:25.000 --> 01:29.000
So if you've heard my AI talks before you have not heard this one.

01:29.000 --> 01:30.000
Because I just wrote it.

01:30.000 --> 01:34.000
About three months ago. This is the second time I'm giving it.

01:34.000 --> 01:36.000
I gave it first in Prague.

01:36.000 --> 01:42.000
My talk from 2020 is called Postgres and the Artificial Intelligence Landscape.

01:42.000 --> 01:44.000
And there's the URL right there.

01:44.000 --> 01:48.000
In fact, my website is right here, right up here.

01:48.000 --> 01:52.000
And in fact, if you'd like to download this presentation or

01:52.000 --> 01:57.000
61 other presentations with over 100 videos and 600 blogs about Postgres,

01:57.000 --> 01:59.000
that is the place to go.

01:59.000 --> 02:00.000
And there's the QR code for it.

02:00.000 --> 02:03.000
So this is the other presentation I wrote on AI.

02:03.000 --> 02:08.000
It's more talking about machine learning, deep learning, and effectively what we call

02:08.000 --> 02:09.000
discriminative AI.

02:09.000 --> 02:12.000
I'll talk about what that means in a minute.

02:12.000 --> 02:17.000
So again, if you're looking for a little earlier technology for AI

02:17.000 --> 02:21.000
and still valid technology, you might want to take a look at that talk.

02:21.000 --> 02:26.000
But today, in the next 50 minutes, we're going to talk about the AI explosion.

02:26.000 --> 02:31.000
Hey, the type of AI I'm talking about today has only been around for two years.

02:31.000 --> 02:32.000
Okay?

02:32.000 --> 02:35.000
So we're kind of in the wild, wild west, right?

02:35.000 --> 02:40.000
Few technologies that we deal with are that new, right?

02:40.000 --> 02:48.000
In fact, in fact, it's so new that this week we had news about DeepSeek out of China,

02:48.000 --> 02:52.000
which is a different way of doing AI and potentially more efficient way.

02:52.000 --> 02:57.000
Now, we can argue whether it's actually unique or not and what ways it's more efficient.

02:57.000 --> 03:03.000
But again, the fact that this technology is two years old and almost every month

03:03.000 --> 03:09.000
we're getting a seismic shift in the industry is kind of interesting.

03:09.000 --> 03:10.000
Okay?

03:10.000 --> 03:16.000
The goal of this talk is to really go down into generative AI,

03:16.000 --> 03:19.000
explain how it works.

03:19.000 --> 03:20.000
Okay?

03:20.000 --> 03:22.000
And again, that's only two-year-old technology.

03:22.000 --> 03:27.000
And then explain how that's used for what we call semantic or vector search.

03:27.000 --> 03:33.000
We're going to talk about how it's used for RAG, or retrieval-augmented generation.

03:33.000 --> 03:38.000
And we'll finally talk about how databases are used for AI.

03:38.000 --> 03:45.000
Fundamentally, you can't decide how to use a database for AI unless you understand how AI works.

03:45.000 --> 03:47.000
You can't just sort of throw it out there.

03:47.000 --> 03:49.000
You've got to go down to the roots.

03:49.000 --> 03:51.000
Figure out what's going on.

03:51.000 --> 03:53.000
Where does database make sense?

03:53.000 --> 03:54.000
Where does AI make sense?

03:54.000 --> 03:56.000
So this is a kind of a two-part talk.

03:56.000 --> 04:00.000
Most of it's about AI, and toward the end, we bring in databases to decide where they make sense.

04:00.000 --> 04:01.000
Okay?

04:01.000 --> 04:04.000
I would love to be taking questions while I talk.

04:04.000 --> 04:05.000
I will try.

04:05.000 --> 04:09.000
But I am also conscious that I have 85 slides in 50 minutes,

04:09.000 --> 04:13.000
so I will try and leave most of the questions to the end.

04:13.000 --> 04:15.000
So AI explosion.

04:15.000 --> 04:19.000
Basically, let me flip this around here along here.

04:19.000 --> 04:23.000
ChatGPT was released in November 2022.

04:23.000 --> 04:26.000
So again, two years, two months.

04:26.000 --> 04:28.000
Effectively at this point.

04:28.000 --> 04:33.000
It was probably the first time AI was accessible to the public

04:33.000 --> 04:36.000
in a human-like conversation.

04:36.000 --> 04:37.000
All right?

04:37.000 --> 04:40.000
And that's really, I think, what got everyone excited about it.

04:40.000 --> 04:43.000
And I'm going to explain toward the end of the talk,

04:43.000 --> 04:46.000
how some of the big players like Google missed out on this.

04:46.000 --> 04:51.000
In fact, Google's a specific example that really could have led this,

04:51.000 --> 04:54.000
and, unfortunately for them, missed the boat.

04:54.000 --> 04:59.000
This series right here, this YouTube playlist,

04:59.000 --> 05:04.000
is the best AI playlist I have ever seen.

05:04.000 --> 05:06.000
It is made up of six videos.

05:06.000 --> 05:08.000
Each video is probably an hour long.

05:08.000 --> 05:10.000
So again, feel free to do that.

05:10.000 --> 05:15.000
If you download these slides, anything in blue, you can click on.

05:15.000 --> 05:16.000
Okay?

05:16.000 --> 05:18.000
So you don't even need to worry about it.

05:18.000 --> 05:22.000
Download the slides. When I gave this talk in Prague,

05:22.000 --> 05:23.000
the person said,

05:23.000 --> 05:26.000
I'm going to be spending quite a lot of time on this slide deck.

05:26.000 --> 05:27.000
Okay?

05:27.000 --> 05:31.000
Because one of the goals of my slide decks

05:31.000 --> 05:34.000
is that you can drill down into every single slide, or most,

05:34.000 --> 05:35.000
and get more detail.

05:35.000 --> 05:40.000
Because obviously, I can only cover so much in my talk.

05:40.000 --> 05:41.000
Okay?

05:41.000 --> 05:45.000
Again, we have some pictures down here

05:45.000 --> 05:49.000
About how things have changed and how ChatGPT changed things.

05:49.000 --> 05:51.000
Okay?

05:51.000 --> 05:53.000
I talked about two types of AI.

05:53.000 --> 05:57.000
Up until 2022, all of the AI that we normally worked with

05:57.000 --> 06:02.000
was what we called discriminative AI or predictive AI.

06:02.000 --> 06:06.000
And again, the talk Postgres and the Artificial Intelligence Landscape,

06:06.000 --> 06:11.000
which I wrote in 2020, is exactly about that.

06:11.000 --> 06:14.000
It's exactly about discriminative AI.

06:14.000 --> 06:19.000
Now, discriminative AI is primarily focused on classification.

06:19.000 --> 06:20.000
Okay?

06:20.000 --> 06:25.000
For example, is a credit card charge fraudulent or not?

06:25.000 --> 06:26.000
Yes or no?

06:26.000 --> 06:27.000
Okay?

06:27.000 --> 06:31.000
Is this a picture of a dog, a cat, or a lion?

06:31.000 --> 06:32.000
Okay?

06:32.000 --> 06:33.000
Yes or no?

06:33.000 --> 06:34.000
Which one is it?

06:34.000 --> 06:37.000
Recommendations for web pages.

06:37.000 --> 06:41.000
I'd say here, predictive trends, language translation.

06:41.000 --> 06:46.000
Again, down here, difference between generative and predictive AI.

06:46.000 --> 06:49.000
This is AI up until 2022.

06:49.000 --> 06:50.000
Nothing wrong with it.

06:50.000 --> 06:54.000
There are perfectly legitimate reasons to continue to use it.

06:54.000 --> 06:57.000
That's what my first talk does focus on.

06:57.000 --> 07:02.000
This talk is going to be about the 2022 forward generative part.

07:02.000 --> 07:03.000
Okay?

07:03.000 --> 07:10.000
So generative AI, again, initially coming from open AI, was revolutionary because it changed

07:10.000 --> 07:17.000
the public perception of AI by releasing something that generated either text or images.

07:17.000 --> 07:21.000
And the concept is it generated new content.

07:21.000 --> 07:26.000
It did not classify existing content, it created new content.

07:26.000 --> 07:29.000
And that was the revolutionary part.

07:29.000 --> 07:34.000
And you've got to give OpenAI credit for coming up with this concept.

07:34.000 --> 07:37.000
So, what are the generative AI use cases?

07:37.000 --> 07:39.000
There's a whole bunch of them.

07:39.000 --> 07:42.000
Chat bots is a big one.

07:42.000 --> 07:43.000
Semantic vector search.

07:43.000 --> 07:45.000
I'll talk about that today.

07:45.000 --> 07:47.000
Summarization of text.

07:47.000 --> 07:50.000
Now, summarization is taking a longer text and summarizing it.

07:50.000 --> 07:55.000
That is actually a unique generative use case.

07:55.000 --> 08:00.000
It's that you need a special engine to do just summarization.

08:00.000 --> 08:01.000
Okay?

08:01.000 --> 08:08.000
OpenAI effectively takes several of these different generative options and puts a single interface on them.

08:08.000 --> 08:12.000
But effectively, when you're trying to do some of these, like language translation, again,

08:12.000 --> 08:15.000
it's very impressive.

08:15.000 --> 08:18.000
For example, I'm Armenian.

08:18.000 --> 08:21.000
I'm in the United States, so we speak Western Armenian.

08:21.000 --> 08:25.000
Most every web page only translates Eastern Armenian.

08:25.000 --> 08:27.000
ChatGPT will do Western Armenian.

08:27.000 --> 08:28.000
Shocker.

08:28.000 --> 08:33.000
First time I've been able to actually generate Western Armenian from English, and back and forth.

08:33.000 --> 08:35.000
So, that was a shocker to me.

08:35.000 --> 08:37.000
Generating audio, video, images.

08:37.000 --> 08:40.000
This was crazy.

08:40.000 --> 08:44.000
I was on a channel with some Amish people.

08:44.000 --> 08:45.000
I can't describe it.

08:45.000 --> 08:49.000
But effectively, this Amish guy loves lobster.

08:49.000 --> 08:53.000
So, I asked ChatGPT to do a DALL-E image, which is part of ChatGPT.

08:53.000 --> 08:54.000
I asked it to open it.

08:54.000 --> 09:00.000
I said, give me a picture of an Amish man riding a lobster.

09:00.000 --> 09:04.000
And it actually generated a guy with a hat riding the lobster.

09:04.000 --> 09:07.000
And I sent it to the channel and everyone laughed.

09:07.000 --> 09:10.000
So, again, crazy stuff you can do.

09:10.000 --> 09:12.000
Code generation code analysis.

09:12.000 --> 09:16.000
I've actually had some success with this.

09:16.000 --> 09:20.000
So, for example, I wanted to translate some Perl to Python.

09:20.000 --> 09:24.000
And I could basically send it the code and it would translate it for me.

09:24.000 --> 09:25.000
New capability.

09:25.000 --> 09:27.000
I sent some python code up.

09:27.000 --> 09:30.000
I asked for recommendations on how to improve it.

09:30.000 --> 09:31.000
And it came up with that.

09:31.000 --> 09:33.000
Again, that's a whole new thing.

09:33.000 --> 09:35.000
We're in the wild, wild west.

09:35.000 --> 09:37.000
We don't actually know all the things we can do yet.

09:37.000 --> 09:39.000
Programming language conversion.

09:39.000 --> 09:42.000
And again, this is kind of a cool one.

09:42.000 --> 09:44.000
Natural language interface.

09:44.000 --> 09:48.000
So, for example, the ability to generate SQL from a user query.

09:48.000 --> 09:51.000
And I'll show you some examples of that as we go forward.

09:51.000 --> 09:54.000
And again, some nice URLs right here.

09:54.000 --> 09:55.000
Okay.

09:55.000 --> 09:57.000
So, to kind of show you the differences.

09:57.000 --> 10:02.000
Discriminative supervised learning requires training data with known outcomes.

10:02.000 --> 10:04.000
Often problem-domain specific.

10:04.000 --> 10:07.000
And output is a determination or prediction.

10:07.000 --> 10:11.000
Generative outcomes are determined from massive training data.

10:11.000 --> 10:15.000
You're talking, like, terabytes, petabytes of data.

10:15.000 --> 10:21.000
Usually it's general and it outputs new content instead of being predictive.

10:21.000 --> 10:23.000
Okay.

10:23.000 --> 10:25.000
And again, the description of that.

10:25.000 --> 10:28.000
And again, description down here.

10:28.000 --> 10:30.000
The basic issue is that discriminative.

10:30.000 --> 10:32.000
This is a great illustration.

10:32.000 --> 10:39.000
Discriminative is trying to find the boundaries between the various options.

10:39.000 --> 10:40.000
Okay.

10:40.000 --> 10:45.000
Whereas generative is effectively generating new text, new images.

10:45.000 --> 10:46.000
Okay.

10:46.000 --> 10:47.000
You understand the difference.

10:47.000 --> 10:53.000
It's kind of the question of whether you're looking at discrete options or whether you're just generating brand new stuff.

10:53.000 --> 10:55.000
This is what generative looks like.

10:55.000 --> 10:58.000
And I'm going to explain why it looks that way in a couple of minutes.

10:58.000 --> 11:02.000
But effectively, instead of looking for boundaries.

11:03.000 --> 11:09.000
It creates what we call embedding vectors for words or parts of words.

11:09.000 --> 11:13.000
And effectively, those embedding vectors are used to generate new content.

11:13.000 --> 11:17.000
And I'm explaining attention blocks and how that works as we go forward.

11:17.000 --> 11:20.000
Any questions?

11:20.000 --> 11:21.000
Okay.

11:21.000 --> 11:22.000
Great.

11:22.000 --> 11:23.000
All right.

11:23.000 --> 11:25.000
Now, here's going to be a little bit of the brain stuff.

11:25.000 --> 11:26.000
Okay.

11:26.000 --> 11:27.000
I was a math teacher.

11:27.000 --> 11:28.000
I measured it.

11:28.000 --> 11:29.000
I took them.

11:29.000 --> 11:31.000
I was a minor in mathematics.

11:31.000 --> 11:36.000
So we're going to walk through some things here; just follow along.

11:36.000 --> 11:45.000
First, I want to explain that mathematics allows you to represent concepts that don't exist in the real world.

11:45.000 --> 11:48.000
Mathematics allows you to represent concepts that don't exist in the real world.

11:48.000 --> 11:56.000
And I'm going to explain why that's important as we start to talk about some of these embedding vectors and how AI works.

11:57.000 --> 12:03.000
Because some of the stuff it does really can't be understood in the physical world.

12:03.000 --> 12:06.000
You have to think about it in a mathematical way.

12:06.000 --> 12:07.000
I'll do my best to explain that.

12:07.000 --> 12:14.000
For example, the number of atoms in the universe is 10 to the 80th.

12:14.000 --> 12:16.000
I don't know what that means.

12:16.000 --> 12:17.000
It's so many zeros.

12:17.000 --> 12:21.000
I can't really fathom how big that number is.

12:21.000 --> 12:22.000
All right.

12:22.000 --> 12:26.000
That's generally considered the number of atoms in the universe.

12:26.000 --> 12:30.000
So kind of remember that number, because I'm going to reference it later.

12:30.000 --> 12:31.000
Secondly, infinity.

12:31.000 --> 12:33.000
I don't know how many of you deal with infinity.

12:33.000 --> 12:39.000
But effectively, there's a whole mathematical branch that talks about how infinity is used in various contexts.

12:39.000 --> 12:41.000
We don't know what infinity is.

12:41.000 --> 12:46.000
We can't really represent it in the real world because everything we deal with is finite, right?

12:46.000 --> 12:50.000
But again, infinity has a representation in mathematics.

12:51.000 --> 13:00.000
Third example: derivatives in calculus are trying to deal with infinitesimally small adjustments of the slope of a line.

13:00.000 --> 13:03.000
We have a mathematical representation for that.

13:03.000 --> 13:08.000
But again, the physical concept of an infinitesimally small something we can't really represent.

13:08.000 --> 13:10.000
Same thing with integration.

13:10.000 --> 13:13.000
An infinite sum of infinitesimally small slices.

13:13.000 --> 13:18.000
Again, you can't represent it, but mathematics has no trouble with it.

13:19.000 --> 13:23.000
So another mathematical concept is called duality.

13:23.000 --> 13:28.000
Duality says that we can think of mathematics in the physical sense.

13:28.000 --> 13:33.000
We can think of it in a computer science sense, like in an array sense.

13:33.000 --> 13:37.000
And we can also consider it in an abstract or mathematical sense.

13:37.000 --> 13:38.000
They're all the same thing.

13:38.000 --> 13:40.000
They all represent the same thing.

13:40.000 --> 13:42.000
They're just different ways of looking at a number.

13:42.000 --> 13:46.000
Again, we have a URL here that talks about duality if you're really into that.

13:47.000 --> 13:52.000
You can see why that person said I'm going to be spending a lot of time on these slides because there's a lot of stuff here.

13:52.000 --> 13:59.000
All right, so what I'm going to talk about right now is what we call embedding vectors.

13:59.000 --> 14:01.000
I'm going to explain how we create them.

14:01.000 --> 14:07.000
But that effectively is an embedding vector for five words, six words.

14:07.000 --> 14:11.000
But I only have two dimensions on this screen.

14:11.000 --> 14:19.000
Okay, what I have tried to do is represent the concept of pink using 10 dimensions.

14:19.000 --> 14:26.000
And the way I did it was I plotted pink right here on that spot on this graph.

14:26.000 --> 14:36.000
And in that graph, there's a graph up in the corner, which has another set of X-Y axes.

14:36.000 --> 14:39.000
And then from there, there's another one here and there's another one.

14:39.000 --> 14:44.000
So this is one, this is two, this is three, this is four, this is five; that's 10.

14:44.000 --> 14:46.000
Okay, because each one is two dimensions.

14:46.000 --> 14:51.000
So that's 10 dimensions of pink, right.

14:51.000 --> 14:59.000
Most embedding vectors have between a thousand and 12,000 dimensions.

15:00.000 --> 15:05.000
I personally cannot fathom what 12,000 dimensions looks like.

15:05.000 --> 15:08.000
Okay, because we only live in a three-dimensional space.

15:08.000 --> 15:13.000
Even if I could show this graph in three dimensions, it still only gets me the three.

15:13.000 --> 15:16.000
It's not going to get me to a thousand or 12,000.

15:16.000 --> 15:22.000
So you have to basically think, when I'm looking at a vector, that it doesn't have two dimensions.

15:22.000 --> 15:29.000
It has a thousand to 12,000 values, one for each dimension.

15:29.000 --> 15:31.000
Okay.

15:31.000 --> 15:35.000
And in fact, yeah, so ChatGPT uses between a thousand and 12,000 dimensions.

15:35.000 --> 15:39.000
Again, really big and again, we have a URL right here.

15:39.000 --> 15:42.000
Okay, so how big is that?

15:42.000 --> 15:47.000
Each dimension is a four-byte floating-point value.

15:48.000 --> 15:54.000
Okay. If we look at the range of a floating-point value, which has an exponent and mantissa.

15:54.000 --> 15:55.000
Okay.

15:55.000 --> 16:04.000
We're basically looking at 1,000 to 12,000 dimensions being

16:04.000 --> 16:10.000
10 to the 9,000 versus 10 to the 118,000.

16:10.000 --> 16:12.000
Okay.

16:12.000 --> 16:17.000
So we're talking about how many atoms of the universe?

16:17.000 --> 16:22.000
10 to the 80, so 10 to the 80 atoms in the universe,

16:22.000 --> 16:27.000
and we have each vector representing 10 to the 9,000,

16:27.000 --> 16:32.000
or 10 to the 118,000 possible values.

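To put a rough number on that claim, here's a back-of-the-envelope calculation (my own arithmetic, not from the slides): each dimension is a 4-byte float with about $2^{32}$ distinct bit patterns, so a vector of $d$ dimensions has about $(2^{32})^d$ possible values:

```latex
\left(2^{32}\right)^{1{,}000} = 2^{32{,}000} \approx 10^{9{,}600},
\qquad
\left(2^{32}\right)^{12{,}000} = 2^{384{,}000} \approx 10^{115{,}600}
```

That's the same ballpark as the 10 to the 9,000 and 10 to the 118,000 figures here, and it dwarfs the roughly 10 to the 80 atoms in the universe.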
16:32.000 --> 16:34.000
Okay.

16:34.000 --> 16:39.000
This is, you can see why mathematics is necessary because we're dealing at a scale.

16:39.000 --> 16:46.000
It's just way off the charts for anything we can understand in the physical world.

16:46.000 --> 16:47.000
Okay.

16:47.000 --> 16:48.000
Pretty cool.

16:48.000 --> 16:51.000
Again, I have a URL here to give you some ideas.

16:51.000 --> 16:52.000
Okay.

16:52.000 --> 16:55.000
So notice that each vector is the same length.

16:55.000 --> 16:59.000
That is actually a requirement for embedding vectors.

16:59.000 --> 17:00.000
Okay.

17:00.000 --> 17:01.000
Again, some URLs there.

17:01.000 --> 17:02.000
Okay.

17:02.000 --> 17:04.000
Any questions?

17:04.000 --> 17:05.000
Great.

17:05.000 --> 17:06.000
Great.

17:06.000 --> 17:07.000
Thank you.

17:07.000 --> 17:10.000
Part three: creating text embeddings.

17:10.000 --> 17:11.000
This is where it gets exciting.

17:11.000 --> 17:14.000
If you weren't excited before yet, this is it.

17:14.000 --> 17:20.000
I've explained that the way you do generative AI is you create these vectors,

17:20.000 --> 17:24.000
which are literally 1,000 to 12,000 dimensions.

17:24.000 --> 17:27.000
Again, huge number of possible values.

17:27.000 --> 17:30.000
But what, how do you create these embeddings?

17:30.000 --> 17:35.000
And I will tell you, because OpenAI is so new,

17:35.000 --> 17:41.000
it was incredibly difficult to get all this information into this slide deck.

17:41.000 --> 17:47.000
Because we haven't really codified and solidified the knowledge, right?

17:47.000 --> 17:48.000
It's only two years old.

17:48.000 --> 17:51.000
So some people explain this little piece, and some people that piece, and some people

17:51.000 --> 17:52.000
explain this.

17:52.000 --> 17:55.000
So to bring it all together into something understandable

17:55.000 --> 17:57.000
took a bunch of weeks to do.

17:57.000 --> 17:59.000
Thankfully, EDB is big on AI.

17:59.000 --> 18:03.000
And they said, yeah, go ahead and go figure it out.

18:03.000 --> 18:08.000
So it was very, it was nice to be able to just spend time digging into this

18:08.000 --> 18:11.000
and coming out with something that at least excites me.

18:11.000 --> 18:15.000
Because now when I'm talking about AI, I know what I'm talking about.

18:15.000 --> 18:18.000
I'll admit to you, when I study a topic,

18:18.000 --> 18:21.000
I'm not good at the surface level.

18:21.000 --> 18:24.000
I want to understand all the way down to the bottom.

18:24.000 --> 18:26.000
Like if I'm working on a computer problem,

18:26.000 --> 18:31.000
I want to understand the code, I want to understand the operating system,

18:31.000 --> 18:37.000
how virtual memory works, how the CPU works, the IO subsystem all the way down.

18:37.000 --> 18:39.000
And when you understand that whole stack,

18:39.000 --> 18:41.000
if you have a problem and you're trying to optimize,

18:41.000 --> 18:43.000
you understand what's down there.

18:43.000 --> 18:46.000
So again, I need to understand what was down there at the bottom.

18:46.000 --> 18:49.000
So how do we do it?

18:49.000 --> 18:55.000
We've heard the concept of AI training probably for two years now.

18:55.000 --> 18:56.000
What does it mean?

18:56.000 --> 18:59.000
What does it mean when they tell you they're going to do that?

18:59.000 --> 19:04.000
I went to a conference in...

19:04.000 --> 19:07.000
Well, I'll, you know, say it.

19:07.000 --> 19:10.000
Southeast Linux Fest, and there was an AI talk, and I asked the guy,

19:10.000 --> 19:12.000
How do they train these models?

19:12.000 --> 19:15.000
Well, they kind of take some data,

19:15.000 --> 19:18.000
and they put it in like, well, how do they actually do anything?

19:18.000 --> 19:21.000
Yeah, he couldn't really answer.

19:21.000 --> 19:23.000
So I'm going to try and answer that.

19:23.000 --> 19:24.000
Okay.

19:24.000 --> 19:27.000
So the way you do it is you take

19:27.000 --> 19:31.000
you basically create a vector for every English word,

19:31.000 --> 19:34.000
or for every word in whatever language you want,

19:34.000 --> 19:37.000
or if you're using a lot of foreign words,

19:37.000 --> 19:40.000
you use like what they call byte pairs.

19:40.000 --> 19:43.000
It's actually tokens made up of two-letter combinations.

19:43.000 --> 19:45.000
So I'm not going to get into that.

19:45.000 --> 19:48.000
I'm just going to pretend we're doing English words,

19:48.000 --> 19:51.000
but of course when you start to talk about proper names,

19:51.000 --> 19:54.000
they're also going to be different combinations of letters,

19:54.000 --> 19:55.000
so you kind of got to do both,

19:55.000 --> 19:57.000
or you've got to do one or the other.

19:57.000 --> 20:00.000
But let's just assume we're going to create a vector

20:00.000 --> 20:02.000
for every word in the English language.

20:02.000 --> 20:05.000
And that's probably 30 to 50,000 vectors.

20:05.000 --> 20:07.000
Okay, we're going to start simple.

20:07.000 --> 20:10.000
I'm just trying to highlight that there are other ways of doing it.

20:10.000 --> 20:11.000
Okay.

20:11.000 --> 20:15.000
Then I'm going to assign each vector the same length or magnitude,

20:15.000 --> 20:18.000
they're all going to be the same distance from the center.

20:18.000 --> 20:19.000
Okay.

20:19.000 --> 20:23.000
So remember the illustration I had: every vector was at the center,

20:23.000 --> 20:25.000
and they all were the same distance.

20:25.000 --> 20:28.000
So that's like kind of a constant that you have to have.

20:28.000 --> 20:30.000
And this is kind of crazy.

20:30.000 --> 20:33.000
I'm going to assign each vector a random direction

20:33.000 --> 20:37.000
in the 1,000-to-12,000-dimension space.

20:37.000 --> 20:39.000
Now, why would you do that?

20:39.000 --> 20:42.000
Like why wouldn't you try to help it?

20:42.000 --> 20:45.000
Well, you got 30 to 50,000 words, let's say.

20:45.000 --> 20:47.000
Or maybe more.

20:47.000 --> 20:48.000
Okay.

20:48.000 --> 20:51.000
But you have a vector space.

20:51.000 --> 20:58.000
It's 10 to the 9,000 or 10 to the 118,000.

20:58.000 --> 21:01.000
Does it matter randomly where you put it?

21:01.000 --> 21:03.000
Probably not.

21:03.000 --> 21:08.000
But you're just swimming in a vector space.

21:08.000 --> 21:11.000
There's just so much room in there that we can't even

21:11.000 --> 21:13.000
fathom how much room is in there.

21:13.000 --> 21:14.000
Okay.

21:14.000 --> 21:15.000
At least that's where we are today.

21:15.000 --> 21:18.000
And I'm not saying someday we won't figure out a better way.

21:18.000 --> 21:21.000
Right now, we just randomly throw them in there.

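The initialization steps just described — one vector per word, a fixed magnitude, a random direction — can be sketched in a few lines of Python (a toy illustration of the idea, not any real model's code; 8 dimensions here instead of thousands):

```python
import math
import random

DIMENSIONS = 8   # real embeddings: 1,000 to 12,000
MAGNITUDE = 1.0  # every vector gets the same length

def random_unit_vector(dims, rng):
    """Pick a random direction, then scale to the fixed magnitude."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dims)]
    norm = math.sqrt(sum(x * x for x in v))
    return [MAGNITUDE * x / norm for x in v]

def init_embeddings(vocabulary, seed=0):
    """One vector per word: same magnitude, random direction."""
    rng = random.Random(seed)
    return {word: random_unit_vector(DIMENSIONS, rng) for word in vocabulary}

# A real vocabulary would be 30,000 to 50,000 words or byte-pair tokens.
embeddings = init_embeddings(["king", "queen", "man", "woman", "throne"])
```

Dividing by the norm is what gives every vector the same magnitude; the independent Gaussian draws are what make the direction random.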
21:21.000 --> 21:22.000
Okay.

21:22.000 --> 21:23.000
Now, this is the key.

21:23.000 --> 21:25.000
This is the key to the section.

21:25.000 --> 21:30.000
What do we do once we randomly give them a location?

21:30.000 --> 21:35.000
We adjust the direction of each vector using a massive number of training documents.

21:35.000 --> 21:41.000
For each word, we adjust its vector to be closer to the vectors of surrounding words.

21:41.000 --> 21:45.000
Or, for each word, we adjust the vectors of the surrounding words

21:45.000 --> 21:46.000
to be closer to its vector.

21:46.000 --> 21:48.000
Again, we have a lot of URLs there.

21:48.000 --> 21:50.000
Let me be specific.

21:50.000 --> 21:51.000
Okay.

21:51.000 --> 21:52.000
Here is my example.

21:52.000 --> 21:57.000
This is a very classic example that every AI talk seems to use.

21:57.000 --> 21:59.000
A clean man woman.

21:59.000 --> 22:00.000
All right.

22:00.000 --> 22:02.000
And there's a specific reason they chose this example.

22:02.000 --> 22:04.000
I think.

22:04.000 --> 22:06.000
But I'm going to work through that.

22:06.000 --> 22:09.000
And again, this is a great URL.

22:09.000 --> 22:11.000
Everything I put in blue is great to me.

22:11.000 --> 22:15.000
Anyway, because I spent a huge amount of time finding these really detailed things.

22:15.000 --> 22:16.000
Okay.

22:16.000 --> 22:19.000
So, let's look at an example

22:19.000 --> 22:21.000
where this is my training data.

22:21.000 --> 22:22.000
The king was a tall man.

22:22.000 --> 22:24.000
That's my training data.

22:24.000 --> 22:25.000
Okay.

22:25.000 --> 22:26.000
I got that off a web page.

22:26.000 --> 22:28.000
I got that off a chat bar.

22:28.000 --> 22:29.000
I got that out of a movie.

22:29.000 --> 22:31.000
I don't know where I got it from.

22:31.000 --> 22:33.000
The king was a tall man.

22:33.000 --> 22:40.000
So, either I'm going to move "the" closer to "king was a tall man,"

22:40.000 --> 22:45.000
or I'm going to move "king was a tall man" closer to "the."

22:45.000 --> 22:51.000
And then the second step: I'm going to move "king" closer to "the was a tall man,"

22:51.000 --> 22:55.000
or I'm going to move "the was a tall man" closer to "king."

22:55.000 --> 22:57.000
Do you see the difference?

22:57.000 --> 22:58.000
Okay.

22:58.000 --> 23:01.000
And I'm going to do that for all the words.

23:01.000 --> 23:03.000
Right?

23:03.000 --> 23:06.000
So, specifically king is getting closer to man.

23:06.000 --> 23:09.000
Now, I'm going to, let's now, here's my other training data.

23:10.000 --> 23:12.000
Here's my new training text.

23:12.000 --> 23:13.000
The king was a tall man.

23:13.000 --> 23:15.000
The queen was a beautiful woman.

23:15.000 --> 23:18.000
They sat together in the throne room of the castle.

23:18.000 --> 23:19.000
Okay.

23:19.000 --> 23:20.000
It's just very, very simple.

23:20.000 --> 23:24.000
I'm going to throw out some of the words that are minor, like "the."

23:24.000 --> 23:25.000
Right?

23:25.000 --> 23:26.000
That's not really key for me.

23:26.000 --> 23:29.000
So, as I'm training here:

23:29.000 --> 23:31.000
King is going to get closer to man.

23:31.000 --> 23:34.000
Tall is going to get closer to man and king.

23:34.000 --> 23:35.000
Queen is getting closer to woman.

23:35.000 --> 23:37.000
Beautiful is getting closer to woman and queen.

23:37.000 --> 23:38.000
Queen.

23:38.000 --> 23:39.000
Throne is getting closer to castle.

23:39.000 --> 23:44.000
And they're all getting closer to throne and castle, because we're spanning sentences as we're

23:44.000 --> 23:46.000
doing this training.

23:46.000 --> 23:47.000
All right.

23:47.000 --> 23:53.000
Again, in the example, throne is going to be closer to king, and throne to room.

23:53.000 --> 23:55.000
And now look at throne room.

23:55.000 --> 23:59.000
Now because they're close together, throne and room become closer.

23:59.000 --> 24:01.000
And that gets closer to castle.

24:01.000 --> 24:04.000
So it's because there's always adjustment going on.

24:04.000 --> 24:05.000
Okay.

24:06.000 --> 24:12.000
So if we look at this example, this is our final output, what we want.

24:12.000 --> 24:14.000
And again, this is two dimensions.

24:14.000 --> 24:20.000
We have to think in a thousand dimensions or 12,000 dimensions.

24:20.000 --> 24:26.000
So in one of those 12,000 or 1000 dimensions.

24:26.000 --> 24:29.000
King is going to be close to queen.

24:29.000 --> 24:33.000
In another one of those dimensions, man is going to be close to woman.

24:33.000 --> 24:35.000
And then throne is going to be close to castle.

24:35.000 --> 24:41.000
And ideally, the distance between man and woman is the same as the distance between king

24:41.000 --> 24:42.000
and queen.

24:42.000 --> 24:48.000
So you can do something like king minus man plus woman gives us queen.

24:48.000 --> 24:49.000
Okay.

24:49.000 --> 24:56.000
And you can start to see the semantic use of AI here.
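The king minus man plus woman analogy above can be sketched with toy vectors. The 3-dimensional values below are invented purely for illustration; real embeddings use roughly a thousand learned dimensions.

```python
# Toy sketch of "king - man + woman ~ queen" with invented 3-d vectors.
import math

emb = {
    "king":   [0.9, 0.8, 0.1],   # dim 0 ~ "royalty", dim 1 ~ "male", dim 2 ~ "female"
    "queen":  [0.9, 0.1, 0.8],
    "man":    [0.1, 0.9, 0.1],
    "woman":  [0.1, 0.1, 0.9],
    "castle": [0.8, 0.5, 0.5],
    "tall":   [0.1, 0.6, 0.2],
}

def cosine(a, b):
    # Cosine similarity: dot product divided by the vector lengths.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# king - man + woman, dimension by dimension
target = [k - m + w for k, m, w in zip(emb["king"], emb["man"], emb["woman"])]

# The nearest remaining word to the result is queen.
best = max((w for w in emb if w not in ("king", "man", "woman")),
           key=lambda w: cosine(emb[w], target))
print(best)  # queen
```

With real embeddings the same arithmetic works because the "royalty" and "gender" directions were learned from billions of co-occurrences rather than hand-picked.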

24:56.000 --> 24:58.000
Because we've trained on so much data.

24:58.000 --> 25:02.000
When we normally do text search in Postgres, we just have the word king.

25:02.000 --> 25:03.000
We have the word man.

25:03.000 --> 25:04.000
We have the word queen.

25:04.000 --> 25:05.000
We have the word woman.

25:05.000 --> 25:08.000
And we're just looking for specific word matches.

25:08.000 --> 25:17.000
But here, because we're training vectors, each word is getting a thousand-parameter adjustment

25:17.000 --> 25:21.000
to potentially be close to another word it's related to.

25:21.000 --> 25:27.000
So each word can be related to multiple words independently.

25:27.000 --> 25:28.000
Right?

25:28.000 --> 25:35.000
When we move man closer to woman, we're not necessarily moving it farther away from king.

25:35.000 --> 25:36.000
Catch that.

25:36.000 --> 25:40.000
When we move man closer to woman, we're not moving it necessarily farther away.

25:40.000 --> 25:47.000
Because the closeness of king to man is probably on a different dimension than the closeness

25:47.000 --> 25:50.000
of man to woman.

25:50.000 --> 25:54.000
Because we have a thousand or 12,000 dimensions.

25:55.000 --> 25:57.000
This is just kind of what a text embedding looks like.

25:57.000 --> 26:01.000
I'm just giving you an example of what it looks like visually.

26:01.000 --> 26:03.000
Remember we have a thousand squares.

26:03.000 --> 26:05.000
I don't even think there's a thousand on here.

26:05.000 --> 26:10.000
And each one has a floating point number in it.

26:10.000 --> 26:11.000
Right?

26:11.000 --> 26:13.000
So it's pretty powerful.

26:13.000 --> 26:14.000
Okay.

26:14.000 --> 26:15.000
Any questions?

26:15.000 --> 26:16.000
Yes.

26:16.000 --> 26:17.000
Sure.

26:17.000 --> 26:18.000
Yeah.

26:19.000 --> 26:26.000
So the question is, is there a difference between the two training methods?

26:26.000 --> 26:31.000
One is word2vec and one has a different name.

26:31.000 --> 26:33.000
I'm going to talk about it in a minute.

26:33.000 --> 26:36.000
There is a URL I have which explains which one is better.

26:36.000 --> 26:39.000
So one of them is better for a certain purpose.

26:39.000 --> 26:41.000
The other one's better for a different purpose.

26:41.000 --> 26:42.000
Yeah.

26:42.000 --> 26:43.000
Great question though.

26:43.000 --> 26:44.000
Yeah.

26:44.000 --> 26:45.000
Anyone understood?

26:45.000 --> 26:48.000
Yes sir.

26:48.000 --> 26:50.000
What do I mean by moving closer?

26:50.000 --> 26:51.000
I'm going to show that right here.

26:51.000 --> 26:52.000
Okay.

26:52.000 --> 26:53.000
Great.

26:53.000 --> 26:56.000
Specifically mathematically what moving closer looks like.

26:56.000 --> 26:58.000
It's actually really cool.

26:58.000 --> 26:59.000
Great.

26:59.000 --> 27:00.000
I'm getting through with questions.

27:00.000 --> 27:01.000
I'm excited.

27:01.000 --> 27:02.000
Okay.

27:02.000 --> 27:03.000
Great.

27:03.000 --> 27:04.000
Okay. Oh, sorry.

27:04.000 --> 27:05.000
Semantic vector search.

27:05.000 --> 27:06.000
Okay. Again.

27:06.000 --> 27:09.000
Postgres has supported full text search since 2003.

27:09.000 --> 27:11.000
But again, it's word-based.

27:11.000 --> 27:12.000
There's no context.

27:13.000 --> 27:18.000
There's no meaning to the words, and there's no relationship between words or how they relate.

27:18.000 --> 27:20.000
There's no training involved.

27:20.000 --> 27:21.000
You can use synonyms.

27:21.000 --> 27:23.000
But again, it's only ad hoc.

27:23.000 --> 27:24.000
Okay.

27:24.000 --> 27:28.000
But as you can see, this ability to do

27:28.000 --> 27:30.000
semantic search is really cool.

27:30.000 --> 27:33.000
And you can even mix semantic and full text search together.

27:33.000 --> 27:37.000
If you want to. Again, nice URLs here, kind of going over that.

27:37.000 --> 27:38.000
Okay.

27:38.000 --> 27:40.000
So, semantic vector setup.

27:40.000 --> 27:41.000
How do we do it, right?

27:41.000 --> 27:42.000
How do we adjust them?

27:42.000 --> 27:43.000
Great question.

27:43.000 --> 27:44.000
How do we do?

27:44.000 --> 27:45.000
What's the difference between the two of them?

27:45.000 --> 27:51.000
So we download what we call pre-trained text embeddings, or we create our own.

27:51.000 --> 27:53.000
And then we chunk.

27:53.000 --> 27:58.000
We chunk based on the chunk size we choose.

27:58.000 --> 28:00.000
And then we add, okay, hold on.

28:00.000 --> 28:02.000
Let me just make sure.

28:02.000 --> 28:08.000
Maybe I should back up.

28:08.000 --> 28:09.000
Okay.

28:09.000 --> 28:11.000
So let me just back up to that question.

28:11.000 --> 28:14.000
You said, what does it mean to bring the vectors close to each other?

28:14.000 --> 28:17.000
Just to, I'm going to be more specific about what you're saying.

28:17.000 --> 28:23.000
What basically happens is that when we're doing training data,

28:23.000 --> 28:26.000
we do what's called a vector product.

28:27.000 --> 28:31.000
The vector product effectively involves us multiplying the two vectors together.

28:31.000 --> 28:37.000
And we get a new vector, which is effectively between the two vectors that we have.

28:37.000 --> 28:38.000
Okay.

28:38.000 --> 28:42.000
But what's really interesting about it is that the way you do the dot product,

28:42.000 --> 28:52.000
it has a tendency to overemphasize the dimension where those two vectors are closest.

28:52.000 --> 29:00.000
So if I have a thousand dimensions, and one of those dimensions of those two vectors is closer than the other ones.

29:00.000 --> 29:10.000
When I do the dot product in a thousand dimensions, my new vector has a tendency to overemphasize the dimension that's actually more similar.

29:10.000 --> 29:16.000
That's why when you're doing king and queen, and man and woman, and man and king.

29:16.000 --> 29:20.000
There are different dimensions where man and woman are closer, different dimensions where king and queen are closer.

29:20.000 --> 29:24.000
And when you do the dot product, it sees king and queen together in some text.

29:24.000 --> 29:31.000
It mostly moves the dimension where they're closer even closer, but it doesn't touch the other dimensions as much.

29:31.000 --> 29:37.000
And that's why, when you have king and queen moving together,

29:37.000 --> 29:42.000
it's not necessarily moving queen and woman farther apart.

29:42.000 --> 29:46.000
Kind of crazy, but that's the way the dot product works.
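The point about one dimension dominating a dot product can be sketched numerically. The toy 3-dimensional values below are invented; this only shows how the dimension where both vectors are large contributes most of the total.

```python
# Per-dimension contributions to a dot product: the dimension where
# both vectors are large (dimension 0 here) dominates the sum.
king  = [0.9, 0.8, 0.1]   # invented toy values
queen = [0.9, 0.1, 0.8]

contributions = [k * q for k, q in zip(king, queen)]
print(contributions)        # roughly [0.81, 0.08, 0.08]
print(sum(contributions))   # roughly 0.97, dominated by dimension 0
```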

29:46.000 --> 29:52.000
Okay, so let's take a look at, it's a great question.

29:52.000 --> 29:56.000
Let's take a look at how we do an actual semantic search.

29:56.000 --> 29:58.000
So we can kind of see what this looks like.

29:58.000 --> 30:05.000
Okay, so effectively what we're going to do is we're going to download some vectors and we pre-trained them.

30:05.000 --> 30:10.000
So we have man, we have a vector for that, we have woman and vector, we have queen and vector.

30:10.000 --> 30:13.000
30,000 words, 50,000 words.

30:13.000 --> 30:20.000
That data's all been trained, and these vectors are all kind of near each other in different dimensions.

30:20.000 --> 30:25.000
Right, remember man and woman are close in one of those thousand dimensions.

30:25.000 --> 30:30.000
Man and king are closer, king and queen are closer, in different dimensions.

30:30.000 --> 30:33.000
King and castle are closer in a different dimension.

30:33.000 --> 30:36.000
Right, that's kind of really amazing how it works.

30:36.000 --> 30:41.000
And then we can effectively choose a chunk size and we chunk our query together.

30:41.000 --> 30:43.000
Whatever we're doing, okay.

30:43.000 --> 30:46.000
We find a text embedding vectors for all our words.

30:46.000 --> 30:48.000
We average the vectors, then we store them in the database.

30:48.000 --> 30:50.000
So what does that actually mean?

30:50.000 --> 30:53.000
So we find the text embeddings for the words from training.

30:53.000 --> 31:01.000
In this case, like OpenAI, or Llama from Facebook, or... well, actually, I don't think.

31:01.000 --> 31:03.000
Deepseek is not open.

31:03.000 --> 31:06.000
And there's a different open ones and some closed ones.

31:06.000 --> 31:14.000
And then what we do is we take the query, we average all the text embeddings for that query sentence.

31:14.000 --> 31:17.000
And we get one result vector, okay.

31:17.000 --> 31:20.000
And then we search for the closest match.

31:20.000 --> 31:25.000
Right, and I'm going to walk through an exact example of this to give you an idea of exactly what's going on.

31:25.000 --> 31:27.000
So here's a great one.

31:27.000 --> 31:31.000
Assume you wish to index the document using this method.

31:31.000 --> 31:33.000
The sentences are: the king was a tall man.

31:33.000 --> 31:34.000
The queen is a beautiful woman.

31:34.000 --> 31:35.000
They sat together in the throne room.

31:35.000 --> 31:36.000
Okay.

31:36.000 --> 31:38.000
So we choose sentences as the chunk size.

31:38.000 --> 31:40.000
So the king was a tall man.

31:40.000 --> 31:43.000
That's six vectors, six embedding vectors.

31:43.000 --> 31:45.000
We're going to average those six together.

31:45.000 --> 31:47.000
We're going to store them in a database.

31:47.000 --> 31:49.000
We're going to take: the queen was a beautiful woman.

31:49.000 --> 31:52.000
Six vectors, average them together, store them in a database.

31:52.000 --> 31:54.000
They sat together in the throne room of the castle.

31:54.000 --> 31:57.000
Ten vectors, average them together, store them in the database.

31:57.000 --> 31:58.000
Okay.

31:58.000 --> 32:01.000
Now, who is the king?

32:01.000 --> 32:02.000
Who is the king?

32:02.000 --> 32:06.000
We take the four words.

32:06.000 --> 32:08.000
We average them together.

32:08.000 --> 32:16.000
And then we say: which stored embedding is closest to 'who is the king'?

32:16.000 --> 32:18.000
All right.

32:18.000 --> 32:21.000
Remember, we're in a thousand dimensions here.

32:21.000 --> 32:27.000
But the way the search works in Postgres, you can just say: give me the closest vector to this thing.
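A minimal sketch of the search just described, using invented 2-dimensional word vectors: average the per-word vectors of each sentence chunk, then return the stored chunk closest to the averaged query. A real setup would pull thousand-dimension embeddings from a model and store them in Postgres with pgvector.

```python
# Chunk -> average -> nearest-neighbor search, with toy 2-d vectors.
import math

emb = {
    "king": [0.9, 0.1], "queen": [0.1, 0.9], "tall": [0.8, 0.2],
    "man": [0.7, 0.2], "woman": [0.2, 0.8], "beautiful": [0.1, 0.7],
    "who": [0.5, 0.5], "is": [0.5, 0.5], "the": [0.5, 0.5],
    "was": [0.5, 0.5], "a": [0.5, 0.5],
}

def sentence_vector(words):
    # Average the per-word vectors, dimension by dimension.
    vecs = [emb[w] for w in words]
    return [sum(dim) / len(vecs) for dim in zip(*vecs)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# "Store" one averaged vector per sentence chunk.
chunks = ["the king was a tall man", "the queen was a beautiful woman"]
stored = {text: sentence_vector(text.split()) for text in chunks}

# Average the query the same way, then rank stored chunks by closeness.
query = sentence_vector("who is the king".split())
best = max(stored, key=lambda t: cosine(stored[t], query))
print(best)  # the king was a tall man
```

This mirrors the pgvector flow in the talk: the averaging happens in the script, and the database's only job is the "order by closeness" step at the end.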

32:27.000 --> 32:31.000
And it turns out 'the king was a tall man' ends up being the hit, which is great.

32:31.000 --> 32:32.000
Okay.

32:32.000 --> 32:33.000
How do we do this?

32:33.000 --> 32:34.000
I'll actually show you some code.

32:34.000 --> 32:39.000
And in fact, if you're really curious, you can download this SQL file and run it yourself.

32:39.000 --> 32:40.000
Okay.

32:40.000 --> 32:41.000
So have fun.

32:41.000 --> 32:43.000
I download pgvector.

32:43.000 --> 32:50.000
I create a table for all of my content and I create a table for my embeddings.

32:50.000 --> 32:51.000
Okay.

32:51.000 --> 32:53.000
And then I got that from here.

32:53.000 --> 32:54.000
Okay.

32:54.000 --> 32:56.000
Then I insert: the king was a tall man.

32:56.000 --> 32:57.000
Beautiful woman.

32:57.000 --> 32:58.000
They sit in the throne room.

32:58.000 --> 32:59.000
I create an index.

32:59.000 --> 33:01.000
And then I write a python script.

33:01.000 --> 33:04.000
And the Python script effectively connects to OpenAI.

33:04.000 --> 33:06.000
It pulls the embeddings.

33:06.000 --> 33:10.000
And then for every word in the table.

33:10.000 --> 33:11.000
Okay.

33:11.000 --> 33:13.000
Every word in my document.

33:13.000 --> 33:19.000
I insert a vector which averages the words in the sentence.

33:19.000 --> 33:20.000
All right.

33:20.000 --> 33:21.000
Everyone with me.

33:21.000 --> 33:23.000
Remember I told you they take the word.

33:23.000 --> 33:24.000
The king's a tall man.

33:24.000 --> 33:26.000
I find the vectors for all six words.

33:26.000 --> 33:29.000
I average those vectors using dot product.

33:29.000 --> 33:31.000
And then I store that in the database.

33:31.000 --> 33:32.000
Okay.

33:32.000 --> 33:34.000
So that's exactly what I'm doing here.

33:34.000 --> 33:37.000
And effectively here's the document.

33:37.000 --> 33:38.000
Okay.

33:38.000 --> 33:39.000
And here are the embeddings.

33:39.000 --> 33:41.000
This is only three dimensions.

33:41.000 --> 33:42.000
This goes to 1,000.

33:42.000 --> 33:43.000
Okay.

33:43.000 --> 33:45.000
1100 I think or something like that.

33:45.000 --> 33:46.000
Okay.

33:46.000 --> 33:49.000
And each of the dimensions is a four-byte float.

33:49.000 --> 33:51.000
And here's my query table.

33:51.000 --> 33:54.000
Effectively what I do is I take a query.

33:54.000 --> 33:56.000
I pass it to OpenAI.

33:56.000 --> 33:58.000
I ask for the embedding.

33:58.000 --> 34:00.000
And then I search for the closest.

34:00.000 --> 34:01.000
I search for all my embeddings.

34:01.000 --> 34:04.000
And I order them by their closeness.

34:04.000 --> 34:05.000
Okay.

34:05.000 --> 34:07.000
So who is the king?

34:07.000 --> 34:09.000
Who is the king?

34:09.000 --> 34:17.000
And the answer becomes: which embedding vector that I've averaged for my document is closest?

34:18.000 --> 34:21.000
The answer is the king was a tall man.

34:21.000 --> 34:22.000
Okay.

34:22.000 --> 34:24.000
Who is tall?

34:24.000 --> 34:25.000
The king is a tall man.

34:25.000 --> 34:26.000
That.

34:26.000 --> 34:27.000
This is pretty easy.

34:27.000 --> 34:28.000
This is basic.

34:28.000 --> 34:30.000
We could do this with full text search.

34:30.000 --> 34:31.000
Okay.

34:31.000 --> 34:33.000
Who is short?

34:33.000 --> 34:37.000
Now, short doesn't appear anywhere in our system.

34:37.000 --> 34:40.000
And what's odd is the winner is the king is a tall man.

34:40.000 --> 34:42.000
What's going on here?

34:42.000 --> 34:43.000
Okay.

34:43.000 --> 34:46.000
It turns out that tall and short are kind of related.

34:47.000 --> 34:50.000
So it thinks that's the closest.

34:50.000 --> 34:51.000
Right?

34:51.000 --> 34:53.000
Because look at the numbers: 0.68.

34:53.000 --> 34:55.000
These are farther away.

34:55.000 --> 34:56.000
Okay.

34:56.000 --> 34:57.000
Who is pretty?

34:57.000 --> 34:58.000
That's cool.

34:58.000 --> 35:00.000
Pretty and beautiful are associated together.

35:00.000 --> 35:01.000
So it knows.

35:01.000 --> 35:03.000
The closest is the queen.

35:03.000 --> 35:05.000
Who is in the palace?

35:05.000 --> 35:08.000
I don't have the word palace in my text.

35:08.000 --> 35:11.000
But it knows the palace and castle are related.

35:11.000 --> 35:12.000
Okay.

35:12.000 --> 35:13.000
So I have.

35:13.000 --> 35:15.000
They sit together in the throne.

35:16.000 --> 35:17.000
Okay.

35:17.000 --> 35:19.000
Where is the chair?

35:19.000 --> 35:21.000
Did I use the word chair?

35:21.000 --> 35:22.000
Did I?

35:22.000 --> 35:25.000
But it knows that the throne is a chair.

35:25.000 --> 35:27.000
Because of the embeddings that it learned.

35:27.000 --> 35:28.000
Okay.

35:28.000 --> 35:30.000
So they sit together in the throne room.

35:30.000 --> 35:32.000
That's the winner.

35:32.000 --> 35:34.000
You can see how.

35:34.000 --> 35:36.000
This is a little different.

35:36.000 --> 35:38.000
This is quite different from a full text search.

35:38.000 --> 35:42.000
Because it has understanding of how words are related to each other.

35:42.000 --> 35:45.000
It does not happen when you're just looking at letters.

35:45.000 --> 35:46.000
Okay.

35:46.000 --> 35:50.000
Any questions?

35:50.000 --> 35:56.000
Yes sir.

35:56.000 --> 35:58.000
So the question is, do you need OpenAI to do that?

35:58.000 --> 36:00.000
In this case you do.

36:00.000 --> 36:02.000
But I could have used Llama.

36:02.000 --> 36:04.000
I could have used anything from Hugging Face.

36:04.000 --> 36:05.000
It would work just fine.

36:05.000 --> 36:08.000
The only reason I did this was to make it super simple.

36:08.000 --> 36:10.000
So you don't have to download an embedding.

36:10.000 --> 36:11.000
You don't have to.

36:11.000 --> 36:14.000
Because sometimes they can be like a couple gigabytes.

36:14.000 --> 36:16.000
And I'm like, let's just make it simple.

36:16.000 --> 36:17.000
Okay.

36:17.000 --> 36:18.000
I did OpenAI.

36:18.000 --> 36:19.000
But you could use anything.

36:19.000 --> 36:21.000
In fact, you can host this yourself.

36:21.000 --> 36:23.000
You don't need to use anybody else.

36:23.000 --> 36:24.000
I just used open AI.

36:24.000 --> 36:26.000
Other questions?

36:26.000 --> 36:27.000
Great.

36:27.000 --> 36:28.000
Thank you.

36:28.000 --> 36:29.000
Okay.

36:29.000 --> 36:30.000
Great.

36:30.000 --> 36:31.000
Okay.

36:31.000 --> 36:33.000
So I talked about.

36:37.000 --> 36:39.000
Vectors and semantic search.

36:39.000 --> 36:42.000
But I haven't talked about generative AI yet.

36:42.000 --> 36:43.000
Okay.

36:43.000 --> 36:49.000
Now I'm going to talk about generative AI because it builds on these vectors that we created.

36:49.000 --> 36:50.000
Okay.

36:50.000 --> 36:54.000
And what I'm going to talk about is effectively something called transformers.

36:54.000 --> 36:57.000
Transformers are effectively built from attention blocks.

36:57.000 --> 36:59.000
We're going to work with attention blocks.

36:59.000 --> 37:06.000
And I'm going to try and show you how these text embeddings, which were trained on billions of documents,

37:06.000 --> 37:10.000
Effectively, allow you to generate text or generate images.

37:10.000 --> 37:14.000
In this case, I'm just going to do text; that's simple enough.

37:14.000 --> 37:16.000
Images would be a different talk.

37:16.000 --> 37:18.000
But I'm going to just talk about text.

37:18.000 --> 37:19.000
Okay.

37:19.000 --> 37:22.000
So, how do we do generative AI?

37:22.000 --> 37:24.000
We did semantic search.

37:24.000 --> 37:25.000
Let's do generative AI.

37:25.000 --> 37:33.000
First, we load what we call an attention block with the text embedding vectors of the words used in the user query.

37:33.000 --> 37:35.000
I'll show you what that looks like in a minute.

37:35.000 --> 37:39.000
Then we adjust the vectors to be closer to previous words.

37:39.000 --> 37:40.000
Remember how we adjusted.

37:40.000 --> 37:43.000
You asked about adjusting; we're going to adjust these.

37:43.000 --> 37:44.000
Okay.

37:44.000 --> 37:48.000
And then we're going to repeat step two several times.

37:48.000 --> 37:51.000
And then what's really weird is once we're done.

37:51.000 --> 37:58.000
The last vector of the attention block is our first word that we're going to give to the user.

37:58.000 --> 38:00.000
And then we run attention block.

38:00.000 --> 38:01.000
Again, we get the second word.

38:01.000 --> 38:04.000
Have you ever used an AI system and asked a question?

38:04.000 --> 38:08.000
Have you ever noticed the words come out like one at a time?

38:08.000 --> 38:10.000
Anybody knows that?

38:10.000 --> 38:12.000
That's not an accident.

38:12.000 --> 38:15.000
Because that's the way the attention block is generating it.

38:15.000 --> 38:17.000
It doesn't generate the whole sentence at once.

38:17.000 --> 38:21.000
It generates each word and cycles through the attention block again to get the next word.
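The word-at-a-time loop can be sketched like this. The next-word table is invented purely for illustration; a real model picks each next word by running the attention block over all the previous vectors rather than looking it up in a table.

```python
# Sketch of word-at-a-time generation: each loop iteration emits one word
# and feeds it back in to pick the next, until an end marker appears.
next_word = {  # invented toy table standing in for the model
    "the": "capital", "capital": "of", "of": "france",
    "france": "is", "is": "paris", "paris": "<end>",
}

def generate(start, limit=10):
    out, word = [], start
    while word != "<end>" and len(out) < limit:
        out.append(word)          # stream this word to the user...
        word = next_word[word]    # ...then compute the next one
    return out

print(" ".join(generate("the")))  # the capital of france is paris
```

That feed-back loop is why answers stream out one word at a time in chat interfaces.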

38:21.000 --> 38:23.000
And I'm going to show you exactly how that works.

38:23.000 --> 38:24.000
Okay.

38:24.000 --> 38:25.000
Exactly.

38:25.000 --> 38:29.000
You repeat step two to continue for successive words.

38:29.000 --> 38:30.000
Okay.

38:30.000 --> 38:33.000
So, we're going to show something called self-attention blocks.

38:33.000 --> 38:36.000
There is something also called cross-attention blocks.

38:36.000 --> 38:39.000
Cross-attention blocks are used for translation typically.

38:39.000 --> 38:43.000
But we use self-attention blocks to generate new words.

38:43.000 --> 38:47.000
And again, I'm simplifying what's going on.

38:47.000 --> 38:49.000
I'm not showing you a lot of the complication.

38:49.000 --> 38:51.000
GPUs are key in this.

38:51.000 --> 38:54.000
And again, if you want more details, feel free to look at the URLs.

38:54.000 --> 38:55.000
Okay.

38:55.000 --> 38:58.000
So, again, they can be thousands of dimensions.

38:58.000 --> 39:01.000
Typically, the attention block is only 128 dimensions.

39:01.000 --> 39:04.000
But they don't need a thousand dimensions to do the

39:01.000 --> 39:04.000
attention block, which is great.

39:07.000 --> 39:10.000
Some models use what's called multi-shot learning.

39:10.000 --> 39:12.000
I'm not going to cover that.

39:12.000 --> 39:16.000
I'm not going to talk about word position, sentence endings, or sentence construction.

39:16.000 --> 39:18.000
That's a more detail.

39:18.000 --> 39:20.000
I don't, I can't get into that.

39:20.000 --> 39:22.000
Again, feel free to look down here if you want more detail.

39:22.000 --> 39:23.000
Okay.

39:23.000 --> 39:24.000
So, here's what we're going to walk through.

39:24.000 --> 39:29.000
The question is, what is the capital of France?

39:30.000 --> 39:34.000
All right? We don't know the answer, but again, we want it to tell us the answer.

39:34.000 --> 39:37.000
So, what we're going to do is we're going to take the word what.

39:37.000 --> 39:42.000
And we're going to look it up in the text embedding.

39:42.000 --> 39:44.000
All right?

39:44.000 --> 39:47.000
Then we're going to take is the capital of France.

39:47.000 --> 39:49.000
And we're going to look those up in the text embedding.

39:49.000 --> 39:52.000
And we're going to fill in something called attention block.

39:52.000 --> 39:58.000
So, each of these words is literally a 128-dimension vector,

39:58.000 --> 40:00.000
which came out of my text embedding.

40:00.000 --> 40:02.000
There's something called dimension reduction.

40:02.000 --> 40:05.000
So, the thousand dimensions becomes 128 dimensions.

40:05.000 --> 40:07.000
I'm not going to explain that; I can't.

40:07.000 --> 40:08.000
Okay?

40:08.000 --> 40:10.000
So, what is the capital of France?

40:10.000 --> 40:12.000
So, we're going to take what.

40:12.000 --> 40:17.000
And we're going to adjust 'is' to be closer to 'what'.

40:17.000 --> 40:20.000
That's my first step.

40:20.000 --> 40:26.000
Then I'm going to take, and effectively, this is the way we do self-attention block.

40:26.000 --> 40:27.000
We only look backward.

40:27.000 --> 40:28.000
Okay?

40:28.000 --> 40:30.000
So, I'm going to take 'the'.

40:30.000 --> 40:33.000
And I'm going to make it closer to 'what' and 'is'.

40:33.000 --> 40:34.000
Okay?

40:34.000 --> 40:35.000
I'm going to take capital.

40:35.000 --> 40:37.000
I'm going to make it closer to 'what is the'.

40:37.000 --> 40:40.000
I'm going to take 'of', make it closer to

40:40.000 --> 40:42.000
'what is the capital'.

40:42.000 --> 40:43.000
And I'm going to take France.

40:43.000 --> 40:46.000
I'm going to move it closer to 'what is the capital of'.

40:46.000 --> 40:49.000
That's my first attention block.

40:49.000 --> 40:51.000
Okay?

40:51.000 --> 40:53.000
And I'm going to do the question mark

40:53.000 --> 40:56.000
the same way I've done it.

40:56.000 --> 41:01.000
And again, we're going to move each vector closer to the previous words.

41:01.000 --> 41:02.000
We normalize them.

41:02.000 --> 41:04.000
It's not uniform.
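A simplified sketch of one self-attention pass, assuming invented 2-dimensional vectors: each word is pulled toward the words before it, weighted by a softmax of dot products, then renormalized. Real blocks use around 128 dimensions plus learned projections that are omitted here.

```python
# One simplified, causal self-attention pass: each word only looks backward.
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention_step(vectors):
    new = [vectors[0]]  # the first word has nothing earlier to attend to
    for i in range(1, len(vectors)):
        prev = vectors[:i]
        # Similarity of this word to each earlier word (dot products).
        scores = softmax([sum(a * b for a, b in zip(vectors[i], p)) for p in prev])
        # Attention-weighted average of the earlier words.
        pulled = [sum(w * p[d] for w, p in zip(scores, prev))
                  for d in range(len(vectors[i]))]
        # Blend the word with that average, then renormalize to unit length.
        mixed = [0.5 * x + 0.5 * y for x, y in zip(vectors[i], pulled)]
        norm = math.sqrt(sum(x * x for x in mixed))
        new.append([x / norm for x in mixed])
    return new

words = ["what", "is", "the", "capital", "of", "france"]
vecs = [[0.2, 0.8], [0.5, 0.5], [0.5, 0.5], [0.9, 0.3], [0.5, 0.5], [0.8, 0.4]]
out = attention_step(vecs)  # run this repeatedly, as the talk describes
```

Running `attention_step` over its own output several times is the "repeat step two" part; each pass pulls related words like capital and france further together.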

41:04.000 --> 41:09.000
Again, as this process happens, I get a word.

41:09.000 --> 41:17.000
And it turns out that on my last run through the attention block, I get a final vector,

41:17.000 --> 41:22.000
and the final vector, if I'm going through the attention block maybe 10, 15 times, is the first word

41:22.000 --> 41:25.000
I'm going to return to the user.

41:25.000 --> 41:27.000
I don't know why.

41:27.000 --> 41:28.000
It doesn't make sense.

41:28.000 --> 41:30.000
But think about it.

41:30.000 --> 41:32.000
What is the capital of France?

41:32.000 --> 41:37.000
If I think of the important words there, there's France and there's capital, probably.

41:37.000 --> 41:47.000
And probably 10 or 15 times, I've moved France and capital, I've moved France closer to capital, right?

41:47.000 --> 41:49.000
Over and over again.

41:49.000 --> 41:55.000
And as France is moving closer to capital, it's getting closer to Paris.

41:55.000 --> 41:56.000
Okay?

41:56.000 --> 42:00.000
So effectively, I now have my word and then I keep going.

42:00.000 --> 42:02.000
And I get my next word.

42:02.000 --> 42:06.000
And effectively, word one is that signal.

42:06.000 --> 42:07.000
I cycle through it again.

42:07.000 --> 42:08.000
I get word two.

42:08.000 --> 42:11.000
And I just keep doing that until I get my result.

42:11.000 --> 42:13.000
My full sentence returned.

42:13.000 --> 42:21.000
Because when you end the sentence, when you are done responding,

42:21.000 --> 42:22.000
I can't get into that.

42:22.000 --> 42:25.000
There's a lot of other things going on.

42:25.000 --> 42:29.000
But I'm trying to give you a fundamental understanding of what's happening there.

42:29.000 --> 42:30.000
Okay?

42:30.000 --> 42:31.000
Any questions?

42:31.000 --> 42:32.000
Yes, sir.

42:36.000 --> 42:42.000
So, with attention blocks, in fact, the way it works is that,

42:43.000 --> 42:46.000
every new query gets a completely new attention block.

42:46.000 --> 42:47.000
Okay?

42:47.000 --> 42:51.000
But as you're working through the query, what's really interesting,

42:51.000 --> 42:56.000
the first word is generated after you're going through nine times.

42:56.000 --> 42:59.000
So once you have that first word to get to the second word,

42:59.000 --> 43:01.000
you don't need to start over again.

43:01.000 --> 43:06.000
You just take that word and you compare it to all the previous blocks.

43:06.000 --> 43:08.000
And that gives you your next word.

43:08.000 --> 43:13.000
So the mathematics of, do I need to run the whole attention block for every word?

43:13.000 --> 43:15.000
No, absolutely not.

43:15.000 --> 43:19.000
And that also is covered in that big video series I talked about and explained why.

43:19.000 --> 43:24.000
But effectively, the first word, once you run that first word,

43:24.000 --> 43:31.000
attention wise through all the previous vectors, the next word actually just comes out.

43:31.000 --> 43:32.000
Crazy?

43:32.000 --> 43:34.000
Yes, it works.

43:34.000 --> 43:35.000
Amazing.

43:35.000 --> 43:37.000
Other questions?

43:37.000 --> 43:38.000
Great question.

43:38.000 --> 43:39.000
I had the exact same question.

43:39.000 --> 43:40.000
Okay.

43:40.000 --> 43:42.000
So let's walk through this.

43:42.000 --> 43:47.000
Let's actually look at some examples because you may not believe me or you may think I'm crazy or whatever.

43:47.000 --> 43:48.000
Okay?

43:48.000 --> 43:50.000
So what is the capital of France?

43:50.000 --> 43:53.000
It tells me the capital of France is Paris.

43:53.000 --> 43:54.000
Okay?

43:54.000 --> 43:58.000
Again, there's a sense that you've primed this with, like, a previous sentence.

43:58.000 --> 44:01.000
Again, normally you would think Paris would be the first word,

44:01.000 --> 44:05.000
but it kind of has a structure of answering a question.

44:05.000 --> 44:08.000
Again, again, I'm not going to get into that.

44:08.000 --> 44:11.000
'The capital of France is': I can ask it that way.

44:11.000 --> 44:13.000
I get the same answer.

44:13.000 --> 44:14.000
Where is Paris?

44:14.000 --> 44:15.000
That's kind of interesting.

44:15.000 --> 44:17.000
I get a whole nice description there.

44:17.000 --> 44:18.000
All right?

44:18.000 --> 44:19.000
That cool?

44:19.000 --> 44:23.000
So it knew the difference between, like, the two cases.

44:23.000 --> 44:24.000
Okay?

44:24.000 --> 44:25.000
This is a good one.

44:25.000 --> 44:27.000
Columbus is in Ohio.

44:27.000 --> 44:29.000
Where is Paris?

44:29.000 --> 44:34.000
Again, remember, the earlier words change the later words, right?

44:34.000 --> 44:37.000
Remember the attention blocks.

44:37.000 --> 44:40.000
So Paris is in France.

44:40.000 --> 44:44.000
It says, also, there's a city named Paris in the United States,

44:44.000 --> 44:46.000
located in Texas.

44:46.000 --> 44:48.000
So if you're referring to that, please specify.

44:48.000 --> 44:50.000
So it's trying to give you a hint.

44:50.000 --> 44:54.000
You mentioned Ohio, maybe they want some US stuff here.

44:54.000 --> 44:57.000
Now, if I say, I live in Texas where is Paris?

44:57.000 --> 44:59.000
Now, it gets kind of interesting.

44:59.000 --> 45:02.000
It actually starts talking about Paris, Texas first.

45:02.000 --> 45:05.000
Then it talks about Paris France.

45:05.000 --> 45:11.000
You understand how the attention vectors are changing what's going to come out at the end.

45:11.000 --> 45:14.000
Any questions?

45:14.000 --> 45:16.000
Yes.

45:16.000 --> 45:19.000
How does it know when to stop?

45:19.000 --> 45:22.000
Okay, so I can't really cover that.

45:22.000 --> 45:26.000
You can specify whether you want a long answer or short answer, actually,

45:26.000 --> 45:30.000
as part of ChatGPT or the chat AI system.

45:31.000 --> 45:34.000
The whole sentence structure is something I'm not covering,

45:34.000 --> 45:38.000
because to understand it would be a whole different talk.

45:38.000 --> 45:43.000
I already feel like I'm being aggressive or ambitious.

45:43.000 --> 45:45.000
So, good question.

45:45.000 --> 45:49.000
Okay, so here's something called retrieval augmented generation.

45:49.000 --> 45:55.000
If you've ever heard of that, it effectively allows you to prefix a query with some information.

45:55.000 --> 45:58.000
All right, now that might sound stupid, it might sound useless.

45:58.000 --> 46:01.000
But once I show you some examples, I think it'll be interesting.

46:01.000 --> 46:05.000
So retrieval augmented generation, you can say, where's Paris?

46:05.000 --> 46:11.000
Okay, now if you pre-fix your question with reply briefly,

46:11.000 --> 46:15.000
all of a sudden you get a much shorter response.

46:15.000 --> 46:16.000
Okay?

46:16.000 --> 46:21.000
How about if I do it with a little program using RAG.

46:21.000 --> 46:26.000
I can say 'reply briefly', and here's what effectively happens when I send the information to ChatGPT.

46:26.000 --> 46:32.000
I have a special system role, which says: prefix my query with this description.

46:32.000 --> 46:35.000
That's effectively how OpenAI works.
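The system-role prefixing just described can be sketched as a message payload. The model name in the comment is a placeholder, and the actual API call is left out; this only shows how the prefix and the user's question fit together.

```python
# RAG-style prefixing: retrieved context (or an instruction such as
# "Reply briefly") goes in the system role; the question goes in the
# user role. The model sees both and shapes its answer accordingly.
def build_messages(context, question):
    return [
        {"role": "system", "content": context},
        {"role": "user", "content": question},
    ]

msgs = build_messages("Reply briefly.", "Where is Paris?")
# With an OpenAI client this payload would be sent as, roughly:
# client.chat.completions.create(model="gpt-4o-mini", messages=msgs)
```

Swapping the system content for a JSON dump of SQL rows gives the invoice examples later in the talk; the mechanism is identical.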

46:35.000 --> 46:36.000
Okay?

46:36.000 --> 46:39.000
So, Columbus is in Ohio, where's Paris?

46:39.000 --> 46:41.000
Bingo, I get city of Paris.

46:41.000 --> 46:43.000
I live in Texas, where's Paris again?

46:43.000 --> 46:47.000
I get: if you meant a different city, Paris, Texas, and it starts to talk about that.

46:47.000 --> 46:48.000
Okay?

46:48.000 --> 46:50.000
Here's something kind of cool.

46:50.000 --> 46:54.000
If I'm using the following RAG data for each query,

46:54.000 --> 46:55.000
okay?

46:55.000 --> 46:57.000
So now I'm using some SQL data.

46:57.000 --> 46:59.000
This is a JSON document.

46:59.000 --> 47:01.000
Again, I don't want to spend too much time on it.

47:01.000 --> 47:02.000
I'm running out of time.

47:02.000 --> 47:06.000
But effectively what I'm doing here is I'm giving it some SQL data.

47:06.000 --> 47:12.000
And now I can ask ChatGPT questions using my SQL data.

47:12.000 --> 47:15.000
So I can prefix all my queries with this.

47:15.000 --> 47:18.000
And I can say, when were the orders made?

47:19.000 --> 47:20.000
It knows.

47:20.000 --> 47:22.000
It can read this.

47:22.000 --> 47:23.000
Understand it.

47:23.000 --> 47:27.000
And it can give me a human answer to the question.

47:27.000 --> 47:28.000
Okay?

47:28.000 --> 47:29.000
When is the earliest invoice?

47:29.000 --> 47:31.000
It knows that too.

47:31.000 --> 47:33.000
What is the total of the invoices?

47:33.000 --> 47:34.000
It knows that.

47:34.000 --> 47:37.000
In fact, it shows me how it computed it.

47:37.000 --> 47:39.000
What is the average invoice amount?

47:39.000 --> 47:41.000
It computes that for me easily.

47:41.000 --> 47:44.000
Okay?

47:44.000 --> 47:46.000
This is great.

47:46.000 --> 47:47.000
Today's April 15th.

47:47.000 --> 47:50.000
When is the next payment expected, and which invoices are overdue?

47:50.000 --> 47:52.000
It does that for me.

47:52.000 --> 47:53.000
Cool.

47:53.000 --> 47:55.000
Okay?

47:55.000 --> 47:58.000
What is the total of the invoices that are not for pencils?

47:58.000 --> 47:59.000
Cool.

47:59.000 --> 48:00.000
Okay?

48:00.000 --> 48:04.000
So you can start to see how I prefix my query with some data,

48:04.000 --> 48:05.000
Maybe coming from the database.

48:05.000 --> 48:08.000
I could actually do some cool stuff.
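
The "prefix the query with database data" step can be sketched roughly like this. The invoice fields are invented, since the talk does not show the demo's actual schema; only the pattern matters:

```python
import json

# Hypothetical rows standing in for data pulled from Postgres.
invoices = [
    {"id": 1, "date": "2024-01-05", "amount": 120.00, "paid": True},
    {"id": 2, "date": "2024-02-10", "amount": 80.00, "paid": False},
]

def rag_prompt(rows: list[dict], question: str) -> str:
    """Prefix the user's question with the data, as described above."""
    return (
        "I am using the following data for each query:\n"
        + json.dumps(rows, indent=2)
        + "\n\nQuestion: " + question
    )

prompt = rag_prompt(invoices, "What is the total of the invoices?")
# The model reads the JSON and answers in plain language; we can
# check the arithmetic ourselves:
total = sum(r["amount"] for r in invoices)  # 200.0
```

Questions like "when is the earliest invoice" or "which invoices are overdue" all work the same way: the data travels with the question.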

48:08.000 --> 48:09.000
Okay?

48:09.000 --> 48:10.000
Today's April 15th.

48:10.000 --> 48:11.000
What's the next expected order date?

48:11.000 --> 48:14.000
It actually kind of looks at the pattern of orders.

48:14.000 --> 48:18.000
It figures out when the next expected one is. Kind of cool.

48:18.000 --> 48:21.000
I can even generate SQL from ChatGPT.

48:21.000 --> 48:23.000
Kind of cool.

48:23.000 --> 48:24.000
Okay?

48:24.000 --> 48:26.000
I can pull data from the database.

48:26.000 --> 48:33.000
So again, I can even run SQL and have it pull information and automatically query that. Kind of cool.

48:33.000 --> 48:36.000
Again, this is what I kind of alluded to in the beginning.

48:36.000 --> 48:38.000
We're only a couple of years into this.

48:38.000 --> 48:41.000
Now Google originally invented this.

48:41.000 --> 48:47.000
In 2013, there's a word2vec paper they put out.

48:47.000 --> 48:50.000
And that really started it.

48:50.000 --> 48:54.000
And then another paper from Google in 2017 on attention blocks.

48:54.000 --> 49:04.000
But what happened was Google was so focused on text search that they never really saw the idea of pulling data together from different places.

49:04.000 --> 49:06.000
And that's where ChatGPT came in.

49:06.000 --> 49:09.000
They did the groundbreaking research, but

49:09.000 --> 49:13.000
they focused so much on revenue generating activities like web search and advertising.

49:13.000 --> 49:17.000
They kind of missed the boat and they're playing catch up, which is kind of surprising.

49:17.000 --> 49:19.000
But that's the way it is.

49:19.000 --> 49:22.000
I'm going to finish with two quotes.

49:22.000 --> 49:23.000
Okay?

49:23.000 --> 49:24.000
I know I'm two minutes over.

49:24.000 --> 49:25.000
First quote.

49:25.000 --> 49:26.000
I'm just going to read it.

49:26.000 --> 49:27.000
Second quote.

49:27.000 --> 49:28.000
I'm going to read it.

49:28.000 --> 49:32.000
Just try to have both quotes in your head at the same time.

49:32.000 --> 49:33.000
First quote.

49:33.000 --> 49:34.000
Here I go.

49:34.000 --> 49:38.000
If you had said to me that in our lifetime we would have capabilities like we have today.

49:38.000 --> 49:39.000
Now, ChatGPT-4.

49:39.000 --> 49:42.000
Let me restart. Here I go.

49:42.000 --> 49:45.000
If you had explained the kinds of things ChatGPT-4 can do,

49:45.000 --> 49:47.000
I probably would have told you a year ago.

49:47.000 --> 49:51.000
I don't know if we'd have these capabilities in our lifetime.

49:51.000 --> 49:54.000
And now we have it today.

49:54.000 --> 49:58.000
So the speed at which this is moving is staggering.

49:58.000 --> 50:00.000
2023.

50:00.000 --> 50:01.000
Okay?

50:01.000 --> 50:02.000
That's one approach.

50:02.000 --> 50:04.000
Second approach.

50:04.000 --> 50:10.000
We are used to the idea that people or entities that can express themselves or manipulate language

50:10.000 --> 50:11.000
are smart.

50:11.000 --> 50:12.000
That is not true.

50:12.000 --> 50:16.000
You can manipulate language and not be smart.

50:16.000 --> 50:20.000
And that's basically what large language models are demonstrating.

50:20.000 --> 50:21.000
2024.

50:21.000 --> 50:22.000
Okay?

50:22.000 --> 50:23.000
Both are true.

50:23.000 --> 50:25.000
I would say.

50:25.000 --> 50:26.000
And that is the end.

50:26.000 --> 50:27.000
I appreciate it.

50:27.000 --> 50:28.000
I'm sorry.

50:28.000 --> 50:29.000
I went over time three minutes.

50:29.000 --> 50:31.000
But I am happy to take questions.

50:31.000 --> 50:32.000
I am over time.

50:32.000 --> 50:34.000
So this lady will decide when I'm finished.

50:34.000 --> 50:36.000
I will stay here as long as possible.

50:36.000 --> 50:38.000
If anybody has questions.

50:38.000 --> 50:47.000
Very nice presentation.

50:47.000 --> 50:50.000
I really like the last quote.

50:50.000 --> 50:55.000
You said you're giving a presentation in Amsterdam, or not.

50:55.000 --> 50:56.000
Right?

50:56.000 --> 50:58.000
When and where is it?

50:58.000 --> 50:59.000
Maybe.

50:59.000 --> 51:02.000
So I'm not presenting in Amsterdam.

51:02.000 --> 51:05.000
I'm just attending the Postgres Meetup in Amsterdam.

51:05.000 --> 51:10.000
So I think I came too late to get on the schedule basically.

51:10.000 --> 51:13.000
So I may do like a lightning talk or something.

51:13.000 --> 51:14.000
It's at JetBrains.

51:14.000 --> 51:16.000
And if you go to meetup.com, you'll see it.

51:16.000 --> 51:17.000
Great.

51:17.000 --> 51:21.000
At some point...

51:21.000 --> 51:26.000
Please be quiet at the back.

51:26.000 --> 51:28.000
We're still asking questions here.

51:28.000 --> 51:29.000
All right.

51:29.000 --> 51:33.000
So at some point you showed the chunking process.

51:33.000 --> 51:36.000
So we can chunk by word by sentence.

51:36.000 --> 51:39.000
By paragraph sometimes or the entire article.

51:39.000 --> 51:40.000
That's right.

51:40.000 --> 51:44.000
Are there any strategies or approaches documented so far

51:44.000 --> 51:47.000
that show that chunking one way is better than another?

51:47.000 --> 51:48.000
Yes.

51:48.000 --> 51:53.000
So the slide where I talked about chunking has a URL at the bottom,

51:53.000 --> 51:57.000
which talks about the various chunking strategies.

51:57.000 --> 52:04.000
I'm trying to get to it here.

52:04.000 --> 52:05.000
I think it's here.

52:05.000 --> 52:06.000
Here.

52:06.000 --> 52:07.000
Here.

52:07.000 --> 52:08.000
Right here.

52:08.000 --> 52:10.000
Guided chunking strategies.

52:10.000 --> 52:11.000
Guided.

52:11.000 --> 52:13.000
Document chunking for AI applications.

52:13.000 --> 52:14.000
That is where you would look.

52:14.000 --> 52:15.000
Yeah.

52:15.000 --> 52:16.000
All right.

52:16.000 --> 52:17.000
Thank you.

52:17.000 --> 52:19.000
Because there's different trade-offs for different chunking.
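
As a rough illustration of the granularity trade-off this question is about, here are two naive chunkers, one per sentence and one per paragraph. Real pipelines usually add token limits and overlap between chunks; this is only a sketch:

```python
# Two chunking granularities with opposite trade-offs.

def chunk_sentences(text: str) -> list[str]:
    """Small chunks: precise matches, but little surrounding context."""
    return [s.strip() + "." for s in text.split(".") if s.strip()]

def chunk_paragraphs(text: str) -> list[str]:
    """Large chunks: more context per hit, but fuzzier similarity scores."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]

doc = "The king is tall. He rules wisely.\n\nThe queen is clever. She reads."
sentences = chunk_sentences(doc)    # 4 small chunks
paragraphs = chunk_paragraphs(doc)  # 2 larger chunks
```

Smaller chunks embed more precisely but can strip away the context a question needs; larger chunks keep context but dilute the embedding. That is the trade-off the cited guide walks through.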

52:19.000 --> 52:20.000
Yeah.

52:20.000 --> 52:21.000
Great question.

52:21.000 --> 52:22.000
I had the same question.

52:23.000 --> 52:25.000
Yes, nice.

52:25.000 --> 52:27.000
We're getting a little extra time.

52:27.000 --> 52:28.000
Thanks.

52:28.000 --> 52:31.000
So my question is about hallucinations.

52:31.000 --> 52:33.000
So they do.

52:33.000 --> 52:36.000
I get it now a bit better from your explanation.

52:36.000 --> 52:42.000
But could you also give us an explanation of how hallucinations happen?

52:42.000 --> 52:46.000
Or they seem to be a feature of the...

52:46.000 --> 52:47.000
Oh, yeah.

52:47.000 --> 52:48.000
That's better.

52:48.000 --> 52:51.000
So the question is about hallucinations.

52:52.000 --> 52:53.000
Yeah.

52:53.000 --> 52:54.000
Yeah.

52:54.000 --> 52:57.000
So the reason hallucinations happen.

52:57.000 --> 53:04.000
Is actually related to this.

53:04.000 --> 53:07.000
This, I'm sorry.

53:07.000 --> 53:11.000
This is it.

53:11.000 --> 53:14.000
It's not that hallucinations are something wrong.

53:14.000 --> 53:20.000
It's our expectation that the AI has any idea of what it's saying.

53:20.000 --> 53:21.000
Right?

53:21.000 --> 53:22.000
Any idea at all?

53:22.000 --> 53:24.000
It's our fault.

53:24.000 --> 53:27.000
In a way, it's our fault that we think it does.

53:27.000 --> 53:29.000
This is fundamentally the issue.

53:29.000 --> 53:30.000
All right.

53:30.000 --> 53:32.000
I'm not saying this isn't true as well.

53:32.000 --> 53:33.000
Okay.

53:33.000 --> 53:40.000
But effectively, we're projecting what we think as logical on to something that has no idea what it's saying.

53:40.000 --> 53:46.000
And in fact, I tried to show you specifically how the mathematics generates the effect.

53:46.000 --> 53:48.000
But there's no knowledge of what it is.

53:48.000 --> 53:53.000
The only knowledge is effectively those text embeddings that it learns from other documents.

53:53.000 --> 53:54.000
Right?

53:54.000 --> 53:58.000
So hallucination is never going to go away, because it doesn't know.

53:58.000 --> 54:02.000
Like, if my kids started answering the way some of these do, you know, you'd think there's something

54:02.000 --> 54:03.000
really wrong with them, right?

54:03.000 --> 54:05.000
But again, this is not a child.

54:05.000 --> 54:07.000
This is a mathematical thing.

54:07.000 --> 54:09.000
And it has no idea what it's saying.

54:09.000 --> 54:11.000
They're continuing to kind of fine tune it.

54:11.000 --> 54:14.000
But effectively, it doesn't know, and it's never going to know.
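
One way to see the "it has no idea what it's saying" point is a toy next-word model: it produces plausible-looking sequences purely from co-occurrence statistics, so a false statement is not a malfunction, just a likely word sequence. Everything here, including the training text, is made up for illustration:

```python
import random

# Train a tiny bigram model: record which word follows which.
text = "the king is tall the king is old the queen is tall".split()
follows: dict[str, list[str]] = {}
for a, b in zip(text, text[1:]):
    follows.setdefault(a, []).append(b)

def generate(start: str, n: int, seed: int = 0) -> str:
    """Emit n next words by sampling from observed continuations."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(n):
        nxt = follows.get(out[-1])
        if not nxt:
            break
        out.append(rng.choice(nxt))
    return " ".join(out)

sentence = generate("the", 4)
# Output is grammatical-looking, e.g. "the queen is old ..." could
# appear even though the training text never asserts it.
```

A real LLM is vastly more sophisticated, but the failure mode is the same in kind: it models which words are likely, not what is true.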

54:14.000 --> 54:24.000
I'm going to have to let her control it, guys.

54:24.000 --> 54:26.000
I'm sorry.

54:26.000 --> 54:29.000
Thank you for the talk.

54:29.000 --> 54:38.000
All these things are about matrix multiplication, or matrix-vector multiplication, etc.,

54:38.000 --> 54:47.000
done by digital computers. What about analog computers for vector multiplications, or matrix multiplications?

54:47.000 --> 54:50.000
It could be much faster.

54:50.000 --> 54:54.000
Yeah, I never got into that.

54:54.000 --> 55:00.000
Only because I was trying to come at this from a database perspective, being a database postgres guy.

55:00.000 --> 55:05.000
I wanted to talk about, having understood the foundations, where does Postgres fit in.

55:05.000 --> 55:07.000
That's a different space.

55:07.000 --> 55:10.000
I never read anything about that.

55:10.000 --> 55:11.000
I'm sorry.

55:11.000 --> 55:12.000
No problem.

55:26.000 --> 55:28.000
I appreciate your coming to the last session.

55:28.000 --> 55:29.000
That was nice.

55:29.000 --> 55:31.000
Thank you.

55:31.000 --> 55:33.000
Yeah, it's just a quick one.

55:33.000 --> 55:42.000
When you showed how to do vector retrieval on postgres, you had one example that was kind of counterintuitive,

55:42.000 --> 55:47.000
where you were asking who is short, and you got the sentence that the king is tall.

55:47.000 --> 55:51.000
And of course, this is because it's just retrieving information semantically close, right?

55:51.000 --> 55:52.000
Right.

55:52.000 --> 55:57.000
So if you can end up with this result, I'm wondering, like, what are good use cases for this?

55:57.000 --> 56:00.000
Because you're clearly not... it seems you're...

56:00.000 --> 56:07.000
Undesignated or free MAC address, and it knew to come back and say: it's called the unassigned MAC block.

56:07.000 --> 56:09.000
And here's what it is.

56:09.000 --> 56:19.000
So in that case, I couldn't remember the word for unreserved or, you know, whatever, unassigned.

56:19.000 --> 56:23.000
But ChatGPT could see I was talking about MAC addresses.

56:23.000 --> 56:29.000
I'm talking about that and it could kind of give me an answer that it found somewhere that was more.

56:29.000 --> 56:32.000
relevant to my question than Google could.

56:32.000 --> 56:37.000
Another case: flying here, I was getting confused between the ticketing airline,

56:37.000 --> 56:40.000
the codeshare airline, and the operating airline.

56:40.000 --> 56:42.000
And I go to search.

56:42.000 --> 56:43.000
Three of them.

56:43.000 --> 56:45.000
And I get websites about codeshare.

56:45.000 --> 56:47.000
I get websites about operating airline.

56:47.000 --> 56:49.000
But I don't get anything together.

56:49.000 --> 56:51.000
I go to ChatGPT.

56:51.000 --> 56:56.000
I said, tell me about the ticketing airline, codeshare airline, operating airline.

56:56.000 --> 57:00.000
It gives me a nice three bullet point description of all three of them.

57:00.000 --> 57:04.000
So in the first case, it took what I wrote,

57:04.000 --> 57:07.000
used vectors to find something close to "unused,"

57:07.000 --> 57:09.000
and knew the term was "unassigned."

57:09.000 --> 57:12.000
That was the term, and then it gave me the result. In the second case,

57:12.000 --> 57:14.000
It brought information from different websites.

57:14.000 --> 57:16.000
One website was about codeshare.

57:16.000 --> 57:17.000
One was about ticketing.

57:17.000 --> 57:19.000
One was about operating.

57:19.000 --> 57:21.000
It's very complicated, the way ticketing works.

57:21.000 --> 57:23.000
And it could kind of pull that together for me.

57:23.000 --> 57:25.000
I think that's really the value.

57:25.000 --> 57:26.000
You're right.

57:26.000 --> 57:28.000
And it can do stupid stuff, too.

57:28.000 --> 57:29.000
So that's never going to end.

57:29.000 --> 57:30.000
I'm afraid.
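
The "who is short" surprise from that exchange can be reproduced with a toy nearest-neighbor search. The vectors below are invented, but the point holds: cosine similarity ranks by topical closeness, not logical agreement, which is why a height question retrieves a height sentence even when it contradicts the query. (pgvector's `<=>` cosine-distance operator does the same ranking in SQL.)

```python
import math

# Toy "embeddings": documents mapped to invented 3-d vectors.
docs = {
    "The king is tall": [0.9, 0.8, 0.1],
    "The recipe needs flour": [0.1, 0.1, 0.9],
}

def cosine(a, b):
    """Cosine similarity: 1.0 means same direction, 0.0 unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

query = [0.85, 0.9, 0.05]  # pretend embedding of "Who is short?"
best = max(docs, key=lambda d: cosine(query, docs[d]))
# best is "The king is tall": same topic (height), opposite meaning.
```

So the retrieval step hands the generative step the *relevant* text; deciding what that text actually says is the language model's job, not the vector search's.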

57:30.000 --> 57:31.000
Yes, thank you.

57:31.000 --> 57:32.000
I'm sorry.

57:32.000 --> 57:33.000
Yes, sir.

57:33.000 --> 57:34.000
Hello.

57:34.000 --> 57:40.000
So my question basically is that we are also using vectors in Postgres.

57:40.000 --> 57:43.000
And we are using pgvector.

57:43.000 --> 57:50.000
So the question is, is there a plan to have a native vector data type in postgres?

57:50.000 --> 57:54.000
Because we are lost in the space of plugins.

57:54.000 --> 57:55.000
You don't know which index

55:55.000 --> 55:56.000
was used.

57:56.000 --> 57:58.000
So, am I right?

57:58.000 --> 58:01.000
So, you know, that's a great question.

58:01.000 --> 58:07.000
When we brought full text search into postgres, it actually was a contrib module.

58:07.000 --> 58:08.000
By the Russians.

58:08.000 --> 58:11.000
And then eventually it was added to postgres.

58:11.000 --> 58:14.000
And then once it was integrated, there were some cool stuff we could do.

58:14.000 --> 58:20.000
The beauty of what we have here is that when AI became important,

58:20.000 --> 58:22.000
we had pgvector already.

58:22.000 --> 58:23.000
It was already there.

58:23.000 --> 58:25.000
So we look like geniuses.

58:25.000 --> 58:27.000
Because you don't have to do anything with postgres.

58:27.000 --> 58:29.000
You just install the extension and you're done.

58:29.000 --> 58:30.000
Same thing happened with data warehouse.

58:30.000 --> 58:32.000
Same thing happened with full text search.

58:32.000 --> 58:34.000
Same thing happened with the cloud.

58:34.000 --> 58:38.000
We didn't have to do much because we could just kind of change what we needed.

58:38.000 --> 58:44.000
So with pgvector, I like the fact that it's on a different development cycle.

58:44.000 --> 58:50.000
Because the AI space is so volatile, I kind of like the fact they can release more often

58:50.000 --> 58:52.000
than postgres does.

58:52.000 --> 58:53.000
Right?

58:53.000 --> 58:58.000
and improve it independently, and then all you have to do is follow the extension.

58:58.000 --> 59:04.000
So for now I like where it is. When the technology eventually solidifies,

59:04.000 --> 59:06.000
we may want to move it in; that may be the case.

59:06.000 --> 59:07.000
I don't know when that would be.

59:07.000 --> 59:09.000
Great question.

59:09.000 --> 59:24.000
I think this will be the last question.

59:24.000 --> 59:29.000
So these searches seem quite expensive clearly.

59:29.000 --> 59:33.000
I know someone asked about chunking and someone asked when does it stop.

59:33.000 --> 59:38.000
I'm wondering if you have any quick insights into ways that you can search

59:38.000 --> 59:42.000
that make it less expensive.

59:42.000 --> 59:43.000
Yeah.

59:43.000 --> 59:55.000
So I think... when I was studying this, before DeepSeek, which came out this week.

59:55.000 --> 01:00:00.000
When I was studying this, it was pretty clear that some of this was overkill.

01:00:00.000 --> 01:00:01.000
Right?

01:00:01.000 --> 01:00:12.000
You're at thousands of dimensions; the vector space is unbelievably big.

01:00:12.000 --> 01:00:13.000
Right?

01:00:13.000 --> 01:00:18.000
And a lot of times what happens when you bring a new technology is you don't know what to optimize.

01:00:18.000 --> 01:00:20.000
I'm thinking of the early compilers.

01:00:20.000 --> 01:00:21.000
Right?

01:00:21.000 --> 01:00:24.000
Like we have compilers now that are unbelievably good.

01:00:24.000 --> 01:00:30.000
Like you know, I'm a postgres guy and sometimes I'll compile something and I'll look at the assembly and I'll be like.

01:00:31.000 --> 01:00:35.000
In a million years, I would never have coded it the way they did.

01:00:35.000 --> 01:00:36.000
Okay?

01:00:36.000 --> 01:00:42.000
But the compiler people know the order, how the registers work, what can be pipelined, what can be parallelized.

01:00:42.000 --> 01:00:43.000
I mean it's genius.

01:00:43.000 --> 01:00:44.000
Okay?

01:00:44.000 --> 01:00:46.000
GCC, all these guys are amazing.

01:00:46.000 --> 01:00:52.000
What I think we have now in this space is like where compilers were in the 60s.

01:00:52.000 --> 01:00:55.000
They're just trying to get code that works, right?

01:00:55.000 --> 01:00:57.000
There's no optimization going on.

01:00:57.000 --> 01:01:04.000
What I think we'll see in the next couple of years is a sense that maybe we don't need to be doing some of this stuff we're doing.

01:01:04.000 --> 01:01:05.000
Okay?

01:01:05.000 --> 01:01:17.000
And I think DeepSeek, although there is concern that some of the training they did used output from OpenAI and others,

01:01:18.000 --> 01:01:25.000
which arguably is cheating, but in terms of the way they do attention blocks and the way they actually do the query part, there are actually some optimizations.

01:01:25.000 --> 01:01:28.000
At least as I understand it; I watched a couple of videos on it.

01:01:28.000 --> 01:01:32.000
It seems like some of the optimizations they're doing could be used by everybody.

01:01:32.000 --> 01:01:40.000
So I think, although this is quite expensive now, I believe that over time this is going to become much more streamlined.

01:01:40.000 --> 01:01:42.000
I don't think we will have as many issues.

01:01:42.000 --> 01:01:44.000
Does that answer your question?

01:01:45.000 --> 01:01:48.000
That's hard to say, but it's fascinating.

01:01:48.000 --> 01:02:02.000
Again, the three pieces are the embedding search, the RAG, and then the generative part. For me, you know, having understood how it all works, I can understand where the database fits in that space.

01:02:02.000 --> 01:02:03.000
So thank you very much.

01:02:03.000 --> 01:02:05.000
I appreciate you staying for the last session.

01:02:05.000 --> 01:02:06.000
Thank you very much.

01:02:06.000 --> 01:02:10.000
Thank you so much for the presentation.

01:02:14.000 --> 01:02:16.000
Thank you very much.

