WEBVTT

00:00.000 --> 00:09.600
Hello everyone, I'm Jakub Florek, a second-year Bachelor of Computer Science and Engineering

00:09.600 --> 00:13.560
student at TU Delft, and last summer I was fortunate to participate in Google Summer of

00:13.560 --> 00:18.760
Code, developing a new Swift unqualified look-up implementation with my mentor, Doug Gregor.

00:18.760 --> 00:22.880
I'd like to tell you today a bit about the library I've developed and about the

00:22.880 --> 00:26.040
process and my experiences as a newcomer to the community.

00:26.280 --> 00:30.080
First, I'd like to address the elephant in the room: what unqualified look-up actually

00:30.080 --> 00:31.080
is.

00:31.080 --> 00:34.040
And for that, I would like to refer to our common understanding of how names work in

00:34.040 --> 00:35.760
programming languages.

00:35.760 --> 00:39.920
So, in Swift, it's for example possible to refer to a variable introduced inside

00:39.920 --> 00:45.560
of a guard statement, or a variable declaration could possibly shadow its predecessors,

00:45.560 --> 00:47.000
such as a member here.

00:47.000 --> 00:50.880
Speaking of members, we can access members that could possibly be introduced in a separate

00:50.880 --> 00:52.880
source file or a separate module.

00:52.880 --> 00:56.960
Sometimes we even use variables that don't have any concrete syntax we could assign them

00:56.960 --> 01:02.480
to, such as the self keyword here, which is implicitly introduced at the function boundary.

01:02.480 --> 01:07.440
We understand those relations, but so does the compiler, and it does it through the unqualified

01:07.440 --> 01:08.440
look-up.

01:08.440 --> 01:13.400
And that's exactly what SwiftLexicalLookup, the library I've developed, actually is.

01:13.400 --> 01:18.200
But in contrast to the compiler implementation, it doesn't build an auxiliary data structure,

01:18.200 --> 01:20.080
it's lightweight and stateless.

01:20.080 --> 01:23.600
Any syntax node can be used as an entry point for the query.

01:23.600 --> 01:28.200
And it then traverses the syntax tree from the starting point to the source, to the root

01:28.200 --> 01:32.560
of the source, to the top of the file, and collects all of the names, and partitions them

01:32.560 --> 01:34.920
according to the scope of introduction.

01:34.920 --> 01:39.920
It then returns this nice enum-based data structure, which makes it really easy to model some

01:39.920 --> 01:43.440
corner cases, such as implicit names we have just seen.

01:43.440 --> 01:48.400
So let's now discuss a more concrete example of how such a query could behave.

01:48.400 --> 01:54.360
So let's perform our look-up here, and let's see what names and what scopes the query

01:54.360 --> 01:55.960
traverses.

01:55.960 --> 01:59.360
So first, the code block; here is a variable declaration.

01:59.360 --> 02:03.640
Then we have a guard statement with two names in it, and we are back to the original code

02:03.640 --> 02:09.760
block, the function body, with one more variable declaration, and then another guard statement.

02:09.760 --> 02:14.280
Now on the function boundary, we need to introduce the implicit self, and now the interesting

02:14.280 --> 02:19.840
bit, because how can we represent the members that could possibly be introduced in a separate

02:19.840 --> 02:20.840
source file?

02:20.840 --> 02:25.560
Well, we can't reason about them with unqualified look-up, and we have the more powerful

02:25.560 --> 02:31.760
member look-up to do that, but we can prompt clients to perform such look-up with what

02:31.760 --> 02:36.720
we call, in SwiftLexicalLookup, a look-up result, which tells clients to basically just go there,

02:36.720 --> 02:40.560
and try to find any names that could possibly be introduced there.

02:40.560 --> 02:44.680
That's almost the entire picture of how unqualified look-up behaves in this case.

02:44.680 --> 02:49.400
There is one bit missing here, but we'll come back to it in just a second.

02:49.400 --> 02:54.560
So having built an intuition of how the query should behave, how do we ensure the

02:54.560 --> 02:56.760
quality of our implementation?

02:56.760 --> 02:58.960
So for that we've devised two kinds of tests.

02:58.960 --> 03:03.400
First, we have the unit tests, which are really cheap and easy to write.

03:03.640 --> 03:09.400
They are standard XCTest tests, which are inside of the swift-syntax package, and they use our custom

03:09.400 --> 03:15.320
developed test harness that takes a code snippet, annotated with markers, and runs a lot

03:15.320 --> 03:17.200
of assertions for them.

03:17.200 --> 03:21.720
Let's try to make a sample test case for our earlier example.

03:21.720 --> 03:25.240
So first, we can simplify our code a bit.

03:25.240 --> 03:30.080
Now we can pass it as a string to our test harness, and annotate it with markers.

03:30.640 --> 03:31.960
Yes, they're emojis.

03:31.960 --> 03:35.960
We're testing important pieces of the compiler with emojis, right?

03:35.960 --> 03:40.640
And it's because now we can define our first entry point using one of those markers

03:40.640 --> 03:42.960
and start defining our assertions.

03:42.960 --> 03:48.200
This data structure that we used to do those assertions,

03:48.200 --> 03:52.920
closely follows what SwiftLexicalLookup produces and makes it really easy to write a test.

03:52.920 --> 03:57.200
Now, we also need to tell the test harness not to perform name matching,

03:57.200 --> 03:59.880
and that's done with a simple parameter.

03:59.880 --> 04:06.000
It just says not to filter out the names based on the identifier we started the look-up with.

04:06.000 --> 04:11.120
So now we have a way to validate our implementation against our mental model.

04:11.120 --> 04:15.120
But how do we check if that's what the compiler and the language expect?

04:15.120 --> 04:20.320
For that we have the second kind of tests, integration tests, which run inside of the compiler,

04:20.320 --> 04:25.440
whenever a special experimental flag is set, it runs and records all of the names produced

04:25.440 --> 04:29.320
by the compiler implementation, runs the equivalent kind of look-up on the SwiftLexicalLookup

04:29.320 --> 04:32.680
side, and tries to match the results.

04:32.680 --> 04:37.760
Whenever something goes wrong, we get those nice tables which show us exactly where the discrepancies are,

04:37.760 --> 04:41.960
and with those gray markers, we also account for some funky compiler behavior,

04:41.960 --> 04:45.400
because as it turns out, it's also not perfect.

04:45.400 --> 04:49.560
Now, my favorite part of the presentation: I'd like to show you some interesting cases

04:49.560 --> 04:52.840
we encountered along the way when working on the library.

04:52.840 --> 04:58.240
So let's go back to the function body, the code block, or, as we more generally call it in SwiftLexicalLookup,

04:58.240 --> 05:00.600
a sequential scope.

05:00.600 --> 05:07.560
In particular, let's look at the second guard statement, because guard statements are weird.

05:07.560 --> 05:11.440
So you see, in the second condition, we have a reference to the data variable,

05:11.440 --> 05:16.240
but it doesn't refer to the variable declaration made before the guard statement.

05:16.240 --> 05:21.080
It refers to the first condition, which has a variable declaration in it,

05:21.080 --> 05:25.440
which then refers to the variable declaration made before the guard statement.

05:25.440 --> 05:28.720
Sure, that makes sense.

05:28.720 --> 05:33.800
But what happens if we try to perform the same kind of look-up from inside of its body?

05:33.800 --> 05:36.440
Well, then we no longer refer to the first condition.

05:36.440 --> 05:40.400
We immediately refer to the variable declaration made before the guard statement.

05:40.400 --> 05:44.040
And that's why guard statements are so weird and different from other scopes,

05:44.040 --> 05:51.040
and require additional filtering logic during look-up and short-circuit behavior.

05:51.280 --> 05:55.920
There is also this one bit that I left out earlier, and it's because of what's going on here.

05:55.920 --> 06:01.680
In particular, nominal type declarations made inside of a code block are normally visible

06:01.680 --> 06:05.200
across its entire scope, right?

06:05.200 --> 06:10.480
So with our entire intuition of just going towards the root of the tree,

06:10.480 --> 06:15.520
we need to revisit it for something like this, where we suddenly also need to check

06:15.520 --> 06:16.720
if there are any names

06:16.720 --> 06:21.920
after that, or maybe there is also a notion of almost visible names that's required by

06:21.920 --> 06:22.920
it.

06:22.920 --> 06:27.760
Sequential scopes are easily the most complicated scope we model with SwiftLexicalLookup.

06:27.760 --> 06:32.280
Lastly, I also wanted to give you an example of a compiler quirk.

06:32.280 --> 06:37.840
Namely, where the compiler first introduces generic parameters, and then function parameters

06:37.840 --> 06:39.160
in macro declarations.

06:39.160 --> 06:44.400
That's rather unexpected, and it's exactly the other way around from how it works in, for example,

06:44.400 --> 06:46.280
function declarations.

06:46.280 --> 06:50.800
And I think this example beautifully explains why it's so hard to work with those

06:50.800 --> 06:55.720
template annotations, because it's about constantly making those judgments, whether it's

06:55.720 --> 06:59.840
a bug on the SwiftLexicalLookup side, or maybe something we just want to model differently

06:59.840 --> 07:05.480
there, or maybe a relatively harmless compiler bug that can be easily accounted for by clients,

07:05.480 --> 07:11.320
or maybe something that could potentially lead to weird and unexpected behavior on the compiler

07:11.320 --> 07:13.600
side, such as in this case.

07:14.160 --> 07:19.120
Lastly, I also wanted to give you some examples of how to contribute to this, but when

07:19.120 --> 07:24.800
working on the presentation, I realized I'd rather tell you how I did it, and give you

07:24.800 --> 07:26.960
some concrete examples.

07:26.960 --> 07:31.280
And I really wanted this whole presentation to be about that, because we first tried to map

07:31.280 --> 07:35.600
our understanding of how names work in programming languages to how unqualified look-up

07:35.600 --> 07:37.440
could behave.

07:37.440 --> 07:42.240
That's also when I started building my first demo, and writing my GSoC proposal.

07:42.240 --> 07:48.880
Then, after I started working with my mentor, Doug, we started implementing our first

07:48.880 --> 07:54.320
initial implementation, applying a sort of test-driven development using our test harness,

07:54.320 --> 08:00.160
and subsequently, once we were able to finally model a great part of the language, we started

08:00.160 --> 08:05.040
testing it against the compiler, and that's really when the complexity just went through

08:05.040 --> 08:12.080
the roof, because I had to dive into a huge heterogeneous codebase, the main Swift compiler codebase.

08:12.240 --> 08:17.760
Once we were able to run validation for an entire compilation process of the standard

08:17.760 --> 08:29.440
library recently, we started stabilizing the API, and I also published an RFC last week to hopefully

08:29.440 --> 08:38.320
make it public to enable clients to start using the library, and with the added benefit that

08:38.320 --> 08:45.200
once the new implementation makes its way to the compiler, and becomes the canonical implementation,

08:45.200 --> 08:48.160
the results will be guaranteed to be correct.

08:49.360 --> 08:54.400
So, with that, I hope you enjoyed the presentation, that you learned something new,

08:54.400 --> 08:58.960
and that you'll now consider maybe contributing to Swift.

08:58.960 --> 09:00.960
Thank you.

