WEBVTT

00:00.000 --> 00:16.000
All right, everybody, we're going to get started.

00:16.000 --> 00:18.000
Please be quiet if you want to talk.

00:18.000 --> 00:20.000
Do it outside.

00:20.000 --> 00:24.000
Be respectful of the speakers, please.

00:24.000 --> 00:29.000
Hello, everybody, please be quiet because Charlie's going to give us a run-down

00:29.000 --> 00:34.000
of what happens under the hood when you actually execute an SQL select statement.

00:34.000 --> 00:35.000
Hey, the way.

00:35.000 --> 00:38.000
Thank you.

00:38.000 --> 00:39.000
Hey, everybody.

00:39.000 --> 00:43.000
It's just one better or the other one better.

00:43.000 --> 00:44.000
The other one?

00:44.000 --> 00:46.000
Okay.

00:46.000 --> 00:48.000
Okay, it's similar to that.

00:48.000 --> 00:49.000
I'm possible, right?

00:49.000 --> 00:53.000
So they think that they's going to gamble with your wish.

00:53.000 --> 00:56.000
The planer is going to do some bats.

00:56.000 --> 00:58.000
It's going to get some informations.

00:58.000 --> 01:05.000
It's going to get some coins and going to bat with your wish.

01:05.000 --> 01:10.000
And then the executor at the end is going to pay for the bill.

01:10.000 --> 01:13.000
Will it pay a smaller or a larger bill?

01:13.000 --> 01:14.000
How does that work?

01:14.000 --> 01:17.000
This is what we're going to see today.

01:17.000 --> 01:20.000
So this is our example bat.

01:20.000 --> 01:23.000
We're going to run this select.

01:23.000 --> 01:30.000
If you see we have four tables, three joints, two filters, and one soft limit.

01:30.000 --> 01:35.000
It's not the very simplistic select, but it's not the very complex select either.

01:35.000 --> 01:40.000
So it's something that you can use to see how that's going to play together, right?

01:40.000 --> 01:46.000
So when you send this to the database, this was supposed to be an animation.

01:46.000 --> 01:48.000
See?

01:48.000 --> 01:50.000
Okay, let's run again.

01:50.000 --> 01:56.000
So when you send this to the database, it goes to the pass, then to the analyzer,

01:56.000 --> 01:59.000
re-writer, planer, executor, and then the result.

01:59.000 --> 02:05.000
So help each because you're going to see each of them going to do here.

02:05.000 --> 02:08.000
So everything starts from the pass.

02:08.000 --> 02:14.000
So the pass is the guy that's going to get your query and see if everything is in order.

02:14.000 --> 02:16.000
It doesn't care if the table exists now.

02:16.000 --> 02:19.000
It doesn't care if the columns exist or not.

02:19.000 --> 02:24.000
But it cares if it has an order, just like when you speak an English.

02:24.000 --> 02:28.000
When you say, I would like to, you say, I would like to.

02:28.000 --> 02:31.000
You don't say, like would I do to do something, right?

02:31.000 --> 02:33.000
So it has order.

02:33.000 --> 02:39.000
It's need to say if you write front, not for, or M, F, or R.

02:39.000 --> 02:41.000
So that's what I'm going to do.

02:41.000 --> 02:44.000
And then it's going to create a tree.

02:45.000 --> 02:48.000
Just like the one that we see here, this is the pass in three.

02:48.000 --> 02:55.000
So this pass in three, going to follow our curve along the way, because it's going to be re-use it layer.

02:55.000 --> 03:00.000
This is the starting of the internal language of the database.

03:00.000 --> 03:04.000
We're going to use force with an example here, but that happens in almost all the database.

03:04.000 --> 03:07.000
At least the ones that I know.

03:07.000 --> 03:12.000
So after that, we need to do a semantics analysis.

03:12.000 --> 03:16.000
Okay, now we really care if they will exist or not.

03:16.000 --> 03:19.000
So we need to check if those tables exist.

03:19.000 --> 03:22.000
We need to check if the comparison that we're doing there,

03:22.000 --> 03:27.000
we're using the right operator for the correct times.

03:27.000 --> 03:33.000
Sometimes you want to do a select where something was equals something else.

03:33.000 --> 03:35.000
But we're comparing different types.

03:35.000 --> 03:39.000
We're comparing an integer with a date or something else.

03:39.000 --> 03:44.000
That depending on your database and the syntax is it's not allowed.

03:44.000 --> 03:53.000
Then you need to do either change the types or you need to do an in place change for them.

03:53.000 --> 03:57.000
So this is what this guy is going to do.

03:57.000 --> 04:00.000
So and then it create an audit tree.

04:00.000 --> 04:06.000
If we use the tree that we sent before, a pass tree, it's going to create a query tree.

04:06.000 --> 04:12.000
So these are those where you could rewrite some inserts inside of the bills.

04:12.000 --> 04:18.000
So these are all of those things that are going to be to change it here.

04:18.000 --> 04:24.000
All right, now we have most of the things that we need to execute.

04:24.000 --> 04:26.000
We have a plan now.

04:26.000 --> 04:33.000
So but before we execute, we need to decide how we're going to execute.

04:33.000 --> 04:37.000
We know which are the tables we want to select.

04:37.000 --> 04:43.000
We know which are the roles we want to select, the kind and no because we have the referrals.

04:43.000 --> 04:47.000
And you know the max number of roles we want to get.

04:47.000 --> 04:49.000
Remember we have a limit 100 here.

04:49.000 --> 04:54.000
So we want to cap our result when it gets to 100 roles.

04:54.000 --> 04:55.000
We don't want more.

04:55.000 --> 04:58.000
We may get less, but we don't want more of those ones.

04:58.000 --> 05:01.000
So this is the information that the database has.

05:01.000 --> 05:04.000
But now it needs to make a few of decisions.

05:04.000 --> 05:07.000
So the first decision that it needs to do.

05:07.000 --> 05:10.000
Here is how to scan.

05:10.000 --> 05:12.000
Then how to join.

05:12.000 --> 05:15.000
And the order and if you do some part of the list.

05:15.000 --> 05:19.000
And to do those things to make those decisions.

05:19.000 --> 05:23.000
This is where come the odds for the our batch.

05:23.000 --> 05:26.000
The decisions that are made by based on course.

05:26.000 --> 05:28.000
Remember we want to pay the lesser bill.

05:28.000 --> 05:29.000
The lesser value.

05:29.000 --> 05:32.000
The lesser money to run our select.

05:32.000 --> 05:34.000
But how do we calculate that course?

05:34.000 --> 05:36.000
How do they calculate that course?

05:36.000 --> 05:41.000
So we have some unities that the database is going to use.

05:41.000 --> 05:46.000
And some of them, how do we get the data?

05:46.000 --> 05:48.000
Do we get the data sequentially?

05:48.000 --> 05:54.000
Or do we get the data in a random way inside of the disk?

05:54.000 --> 05:56.000
Why does it matter?

05:56.000 --> 06:00.000
So in all days when a user to use is speeding this.

06:00.000 --> 06:05.000
Get in the data sequentially is a lot faster than randomly.

06:05.000 --> 06:07.000
Because the speed in the disk.

06:07.000 --> 06:10.000
The head of the disk for one place or another took a lot of time.

06:10.000 --> 06:12.000
And if you install your course risk well right now.

06:12.000 --> 06:14.000
And you go for the configuration.

06:14.000 --> 06:16.000
You see this parameter here.

06:16.000 --> 06:18.000
Random page cost equals four.

06:18.000 --> 06:23.000
It's telling the database that if you need to take a random page.

06:23.000 --> 06:29.000
It's going to cost four times a stake in one sequential page.

06:29.000 --> 06:32.000
Or give it, why does it matter?

06:32.000 --> 06:37.000
Well, it matters because your index is basically random access on the database.

06:37.000 --> 06:42.000
And your table, full table scan is sequential access on the database.

06:42.000 --> 06:46.000
So if you have a table and in that table have an index.

06:46.000 --> 06:51.000
And if the number of roles compared that's read from the index on the table is similar

06:51.000 --> 06:56.000
or four times lesser the database is not going to use your index.

06:56.000 --> 06:58.000
It's going to do a full table scan.

06:58.000 --> 07:02.000
And that's why sometimes you have this really nice index here in my table.

07:02.000 --> 07:05.000
And if it is a lack, it's not using my index.

07:05.000 --> 07:07.000
Because it's expensive to the database.

07:07.000 --> 07:10.000
That's one of the reasons, but they only one.

07:10.000 --> 07:12.000
So we have all the parameters as well.

07:12.000 --> 07:15.000
So we have the CPU couple costs.

07:15.000 --> 07:20.000
That is how much going to cost to the database to process each role that you're going to select.

07:20.000 --> 07:25.000
So when you have these outdated statistics, those numbers might be wrong.

07:25.000 --> 07:29.000
And then you might be having full table scans instead of index scans.

07:29.000 --> 07:34.000
Not because it's cheaper, but because the polar things are believes it's cheaper.

07:34.000 --> 07:36.000
Based on the statistics, it gets.

07:36.000 --> 07:37.000
Right?

07:37.000 --> 07:39.000
We need to be careful on that one.

07:39.000 --> 07:43.000
So this is having an example here of a planning.

07:43.000 --> 07:48.000
This is what you get when you do an x-playing on post-grace quo.

07:48.000 --> 07:52.000
It's going to try to see, based on what you have, all the cost.

07:52.000 --> 07:54.000
What are the cost for my sequential scan?

07:54.000 --> 07:57.000
They index this scan, and so on and so forth.

07:57.000 --> 08:04.000
So we have the information that we put here and how it used on the explain.

08:04.000 --> 08:05.000
Right?

08:05.000 --> 08:08.000
And an example.

08:08.000 --> 08:11.000
And if you are familiar with post-grace quo,

08:11.000 --> 08:14.000
we have all the database they do the same as well.

08:14.000 --> 08:16.000
We have a few different types of explain.

08:16.000 --> 08:18.000
We have the explain and explain analyze.

08:18.000 --> 08:22.000
The explain is what the planer believes it's going to do.

08:22.000 --> 08:28.000
And the analyze it actually runs and shows what exactly it does.

08:28.000 --> 08:31.000
And this is the build that you're going to pay.

08:31.000 --> 08:34.000
Not the build that we believe we're going to pay.

08:34.000 --> 08:38.000
So on this way, we have to look at different things.

08:38.000 --> 08:40.000
Okay, sequential scan.

08:40.000 --> 08:44.000
Why or how we usually run sequential scans?

08:44.000 --> 08:45.000
Well, there's a matter.

08:45.000 --> 08:51.000
If you have a table that is too small or if you're going to get a lot more data from the table,

08:51.000 --> 08:56.000
compared to the total amount of the table, it's going to do a full table scan.

08:56.000 --> 08:58.000
Because it's cheaper.

08:58.000 --> 09:06.000
Remember, when we do a index scan, it needs to go to the index which is a random search.

09:06.000 --> 09:11.000
Again, the information then is still need to get the data from the table.

09:11.000 --> 09:14.000
Unless we do what we call an index only scan.

09:14.000 --> 09:18.000
If all the information you want is on the index, that's fine.

09:18.000 --> 09:19.000
There's a need to go to the table.

09:19.000 --> 09:24.000
For example, if you want, if you have an index that has the names.

09:24.000 --> 09:27.000
And they won't only to get the name, so all that information.

