WEBVTT

00:00.000 --> 00:27.080
Hello, hi, hello. I'm Siza, a newcomer to the Ada and SPARK community and the

00:27.120 --> 00:35.120
ecosystem. I have been studying cryptography and cybersecurity, and that's what inspired

00:35.120 --> 00:41.760
me to write a big integer library for cryptography in SPARK.

00:41.760 --> 00:50.080
So we need constant-time cryptographic primitives like big integers; we need them to

00:50.080 --> 00:57.040
avoid side channels in cryptographic code, and I wanted to be able to provide some

00:57.040 --> 01:02.040
strong primitives for building such things in Ada.

01:02.040 --> 01:10.440
So yeah, I wrote a big integer library in SPARK, with a design and algorithms that were

01:10.440 --> 01:17.480
inspired by the Rust crypto-bigint crate, and all the functions that I wrote run in constant

01:17.480 --> 01:24.680
time, explicitly named as such. But yeah, it's still experimental code, so I would

01:24.680 --> 01:31.680
advise against using it in security-sensitive contexts.

01:31.680 --> 01:36.440
So I've been talking about constant time since the beginning, but what does it

01:36.440 --> 01:42.680
actually mean in the context of big integers?

01:42.680 --> 01:48.480
A constant-time algorithm is an algorithm that has no dependencies, in terms of

01:48.480 --> 01:54.960
resources used, on secret values, values that we want to remain private.

01:54.960 --> 02:02.160
So it will not leak data through branches, memory accesses, or

02:02.160 --> 02:11.000
timing, and all the operations that you can do on those structures always take

02:11.000 --> 02:12.840
the same time and the same resources.

02:12.840 --> 02:18.240
So we have no branching on secret values and no memory accesses that depend on secret

02:18.240 --> 02:27.040
values to avoid leaks due to cache access latency and such things.
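
A minimal sketch of the core masking trick described here, written in Rust for illustration (the talk's library is in SPARK; the name `ct_select` is mine, not the library's API):

```rust
// Branchless conditional select: no branch on the secret condition.
// `c` must be exactly 0 or 1; it is expanded into an all-zeros or
// all-ones mask, and the result is picked with bitwise arithmetic,
// so control flow and memory accesses do not depend on `c`.
fn ct_select(a: u64, b: u64, c: u64) -> u64 {
    let mask = c.wrapping_neg(); // 0 -> 0x0000…0000, 1 -> 0xFFFF…FFFF
    (a & !mask) | (b & mask)     // c == 0 picks a, c == 1 picks b
}

fn main() {
    assert_eq!(ct_select(7, 9, 0), 7);
    assert_eq!(ct_select(7, 9, 1), 9);
    println!("ok");
}
```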

02:27.040 --> 02:36.720
And for instance, there's a famous vulnerability in some RSA code that

02:36.720 --> 02:44.320
would leak information about the secret RSA exponent during binary exponentiation

02:44.320 --> 02:50.400
by taking more time on a one bit than on a zero bit. Sorry about the quality of

02:50.400 --> 02:59.560
the picture. But in constant-time code like that, the code is usually less readable

02:59.560 --> 03:04.400
because we have to use different techniques and avoid certain patterns that we don't

03:04.400 --> 03:06.400
want to have in our code.

03:06.400 --> 03:10.600
It's usually a bit slower because we need to always take the maximum amount of time for

03:10.680 --> 03:18.600
an operation. And sadly, it's also always a fight against the compiler, because some of

03:18.600 --> 03:24.920
the optimizing compilers might recognize a constant-time construct

03:24.920 --> 03:33.400
and decide to replace it with a faster, variable-time construct.

03:33.400 --> 03:40.120
So for instance, in the classical binary exponentiation example, we transform the branch

03:40.120 --> 03:46.680
depending on the bit of the exponent into a conditional select that is implemented

03:46.680 --> 03:56.520
using masking and bitwise arithmetic and we must take care to transform our condition into

03:56.520 --> 04:03.560
a mask properly in order to avoid some compiler optimizations. That's all well and good,

04:03.560 --> 04:13.000
but it also leads to code that is, as I said, less readable and harder to understand,
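
For illustration, the transformation just described can look like this in Rust: a toy u64 modular exponentiation where every iteration always squares and always multiplies, and a mask-based select keeps or discards the multiplication (this is not the library's code, which is SPARK and operates on multi-limb integers):

```rust
// Mask-based conditional select (0 or 1 condition expanded to a mask).
fn ct_select(a: u64, b: u64, c: u64) -> u64 {
    let mask = c.wrapping_neg();
    (a & !mask) | (b & mask)
}

// Left-to-right square-and-multiply with a fixed iteration count and no
// branch on the exponent bits: the multiply is always computed, and the
// select decides whether its result is kept.
fn mod_pow(base: u64, exp: u64, m: u64) -> u64 {
    let mut acc: u64 = 1 % m;
    for i in (0..64).rev() {
        acc = ((acc as u128 * acc as u128) % m as u128) as u64;
        let mult = ((acc as u128 * base as u128) % m as u128) as u64;
        let bit = (exp >> i) & 1;
        acc = ct_select(acc, mult, bit); // keep `mult` only if the bit is 1
    }
    acc
}

fn main() {
    assert_eq!(mod_pow(4, 13, 497), 445); // textbook check: 4^13 mod 497
    println!("ok");
}
```

Note that `%` on real hardware is itself not necessarily constant time; in a serious implementation the reduction would also be done with constant-time primitives.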

04:13.560 --> 04:20.040
and that's where I think SPARK comes in, because we can then prove the primitives that we use

04:21.640 --> 04:25.000
to build our constant-time algorithms in SPARK with contracts.

04:25.960 --> 04:34.520
So SPARK obviously brings the assurance that we have no runtime errors,

04:34.520 --> 04:39.480
that our algorithms terminate, and all that.

04:40.440 --> 04:46.680
It can also be used to prove our bitwise operations, like the conditional select or

04:46.680 --> 04:52.840
the conditional swap. And even though we cannot prove that these effectively run in

04:52.840 --> 05:00.520
constant time, because of all the considerations about low-level hardware and

05:00.520 --> 05:05.480
optimizations, we can prove that they effectively do what we want them to do, which is to

05:05.480 --> 05:13.720
select and swap the values as we want. And in fact, all constant-time code, in essence,

05:13.720 --> 05:19.800
if it's not written in pure assembly, always relies on best-effort implementations.

05:20.520 --> 05:25.720
We have to obfuscate our intent from the compiler, we have to use optimization barriers, and

05:26.760 --> 05:32.120
the best we can do to actually ensure that our code is constant time is to audit the assembly

05:32.120 --> 05:39.560
output and measure the resources used by our code when it's running.
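
As an example of such an optimization barrier, Rust exposes `std::hint::black_box`, a best-effort hint that asks the optimizer to treat a value as opaque (this illustrates the general technique; it is not what the SPARK library uses, and it is a hint, not a guarantee):

```rust
use std::hint::black_box;

fn ct_select(a: u64, b: u64, c: u64) -> u64 {
    // black_box hides the provenance of `c` from the optimizer, making it
    // less likely that the mask construction below is recognized and
    // rewritten into a conditional branch. Auditing the emitted assembly
    // is still required to gain real confidence.
    let mask = black_box(c).wrapping_neg();
    (a & !mask) | (b & mask)
}

fn main() {
    assert_eq!(ct_select(3, 8, 0), 3);
    assert_eq!(ct_select(3, 8, 1), 8);
    println!("ok");
}
```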

05:40.280 --> 05:50.360
Yeah. In the library, on top of having proofs for low level and bitwise operations,

05:50.360 --> 05:58.920
I wanted to have some higher-level proofs, including a proof that I wrote for the addition

05:58.920 --> 06:07.080
of two big integers. For that, I was quite surprised by the amount of code that you need to prove

06:08.040 --> 06:18.040
simple things in SPARK, but I think that's the case for most proofs in formal reasoning.

06:18.040 --> 06:26.840
It always takes much more SPARK code and ghost code than it takes actual logic to prove what we

06:26.840 --> 06:35.000
want to prove. That proof is available in the GitHub repository, but it's not yet part of

06:35.080 --> 06:43.160
the released Alire package; it's in a separate branch, but it works.

06:48.040 --> 06:53.000
My main motivation at first, because I was trying out SPARK to write this library, was to

06:53.000 --> 06:57.880
try to write a small elliptic curve implementation, and that's what I did in a small test,

06:57.880 --> 07:06.120
where I used the test vectors of the curve to test my implementation. We can define

07:06.120 --> 07:12.200
operations on a prime field by instantiating a generic package: you define the prime you want,

07:12.200 --> 07:17.320
either at compile time or at runtime, and then instantiate the package you want.
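
The shape of that API can be sketched in Rust, with const generics standing in for Ada's generic package instantiation (a toy example: the names are illustrative, and the field operations here use plain `%`, which is not constant time):

```rust
// Toy prime-field element, parameterized by the modulus P at compile time,
// loosely analogous to instantiating a generic package with a chosen prime.
#[derive(Clone, Copy, PartialEq, Debug)]
struct Fp<const P: u64>(u64);

impl<const P: u64> Fp<P> {
    fn new(x: u64) -> Self { Fp(x % P) }
    fn add(self, o: Self) -> Self {
        Fp(((self.0 as u128 + o.0 as u128) % P as u128) as u64)
    }
    fn mul(self, o: Self) -> Self {
        Fp(((self.0 as u128 * o.0 as u128) % P as u128) as u64)
    }
}

fn main() {
    type F17 = Fp<17>;                 // "instantiate" the field with p = 17
    let a = F17::new(12);
    let b = F17::new(9);
    assert_eq!(a.add(b), F17::new(4)); // 12 + 9 = 21 ≡ 4 (mod 17)
    assert_eq!(a.mul(b), F17::new(6)); // 12 * 9 = 108 ≡ 6 (mod 17)
    println!("ok");
}
```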

07:17.320 --> 07:23.320
And the first version of the library is available in the Alire index right now.

07:23.720 --> 07:31.720
And yeah, that's it.

07:39.080 --> 07:43.720
Any questions there?

07:43.720 --> 07:49.960
You were saying that actually constant-time code needs to be written in assembly;

07:49.960 --> 08:02.120
is that a limitation of Ada? Because I can imagine different compilers generating

08:02.120 --> 08:09.720
different specific code, but also different CPUs interpreting it differently,

08:10.680 --> 08:17.560
or maybe microcode specializing depending on the history within the CPU?

08:17.560 --> 08:28.840
Yeah, so the question is how constant-time code can really be ensured to run in constant time.

08:28.840 --> 08:35.880
Because even in assembly on different architectures and all that, it's kind of an open question,

08:35.880 --> 08:43.640
because as I said in the slides, even a simple instruction like div can run

08:43.640 --> 08:53.400
in variable time depending on the operands. There are no real solutions today,

08:53.400 --> 09:01.400
I think, to that problem, and it's mostly guessing and trying to do our best. But to actually

09:01.480 --> 09:05.480
prove that some code is running in constant time, you would need really deep integration with

09:05.480 --> 09:15.800
the compiler, the architecture, the language, and all that, and it's kind of an open question.

09:32.360 --> 09:37.240
Regarding constant time, as you said, in the end,

09:37.240 --> 09:44.520
there should be support in the compilers, and I think that would be the proper solution.

09:44.520 --> 09:49.960
In the compilers, or, I don't know, maybe you could mark

09:49.960 --> 09:56.840
a section of code that should use only constant-time operations.

09:56.920 --> 10:07.000
So the observation is that such a problem should be solved at the compiler

10:07.000 --> 10:13.560
level, mostly, with specializations and annotations to say that such a block

10:13.560 --> 10:17.640
runs in constant time and all that. And I can just say that

10:17.640 --> 10:19.640
I agree completely, yeah.

10:31.240 --> 10:37.240
[Inaudible question.]

10:39.240 --> 10:41.240
Here is what we want to do.

10:41.240 --> 10:45.960
So those constructs, all of that, it's just a manual branchless version:

10:46.120 --> 10:50.200
it's just transforming the condition into a mask, and then

10:50.200 --> 10:54.440
performing both operations, and then just choosing the result that we want.

10:56.440 --> 11:01.080
It's a manual implementation of branchless programming, in fact.

11:01.480 --> 11:03.480
Yeah.

11:17.480 --> 11:23.480
So in the conditional select primitive, I used a different type than a Boolean to express the condition.

11:24.440 --> 11:31.960
I chose to use an opaque type called Choice that is in itself a mask.

11:31.960 --> 11:37.160
So it's not a 0 or 1, it's a 0 or the maximum value of a U64.
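
A sketch of that idea in Rust (the actual library is SPARK; the `Choice` shown here is my illustration of the described design, not its real API):

```rust
// Opaque condition type that stores the condition directly as a full-width
// mask (all zeros or all ones) rather than as 0 or 1, so operations that
// already produce a mask can be reused without conversion.
#[derive(Clone, Copy)]
struct Choice(u64); // invariant: 0x0000…0000 or 0xFFFF…FFFF

impl Choice {
    fn from_bit(b: u64) -> Self {
        Choice(b.wrapping_neg()) // 0 -> all zeros, 1 -> all ones
    }
    // Some operations already yield a mask; wrap it directly.
    fn from_mask(m: u64) -> Self {
        Choice(m)
    }
    fn select(self, a: u64, b: u64) -> u64 {
        (a & !self.0) | (b & self.0)
    }
}

fn main() {
    assert_eq!(Choice::from_bit(0).select(1, 2), 1);
    assert_eq!(Choice::from_bit(1).select(1, 2), 2);
    assert_eq!(Choice::from_mask(u64::MAX).select(1, 2), 2);
    println!("ok");
}
```

The Rust `subtle` crate uses a similar opaque `Choice` type for the same reason: keeping the compiler from reasoning about the condition as a plain boolean.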

11:37.880 --> 11:46.680
And the reason is to kind of obfuscate what I want to do from the compiler, and also to avoid

11:46.680 --> 11:51.960
doing the transformation all over the place. For instance, there are some operations that

11:52.040 --> 11:55.160
already give me a mask, and I just want to be able to reuse it.

11:57.000 --> 12:02.680
And so yeah, I just built another primitive to do that.

12:06.920 --> 12:11.240
Yeah, have you tried to get a comparison between the two, the variable-time and the

12:11.800 --> 12:13.000
constant-time versions?

12:13.000 --> 12:24.920
[Inaudible.]

12:24.920 --> 12:28.520
So there's a lot of research there. And perhaps we can now switch to

12:28.520 --> 12:31.080
another question.

12:31.080 --> 12:36.680
I was wondering whether you can actually use this library,

12:36.760 --> 12:41.640
whether it's good enough to use in performance-critical code.

12:43.640 --> 12:51.160
So the question is whether the code is usable in performance-critical environments.

12:51.160 --> 12:59.960
And I would say that, since the code is proven by SPARK to be free of runtime errors

13:00.040 --> 13:06.200
and all that, it can be used with assertions and runtime checks disabled, so that kind of

13:06.200 --> 13:15.400
consideration is really good. But there's always the fact that we have to run

13:15.400 --> 13:19.960
operations in the maximum amount of time; that costs a bit. Yeah.

13:22.600 --> 13:28.520
The library that I wrote is not optimized for performance; there are some

13:28.600 --> 13:34.360
operations that I don't implement in the best way, but in the simplest way, sometimes.

13:34.360 --> 13:39.800
So yeah, it could be used, maybe with some further development, you know.

13:43.080 --> 13:46.600
There's not a lot of time left, so maybe one more question.

13:48.840 --> 13:53.880
Maybe, I don't think there should be that much difference between variable-time and constant-time

13:53.880 --> 14:01.640
code, because if we have much entropy, then

14:01.640 --> 14:06.520
in half the cases I know that I don't need to do the multiplication,

14:06.520 --> 14:11.240
and in, like, half the cases there's a one bit and you will have to do the multiplication.

14:11.240 --> 14:15.880
So I think that there's not much loss in the average case,

14:15.880 --> 14:21.880
maybe a factor of half or something. But what I want to ask is: is that enough?

14:22.040 --> 14:24.040
I think that's enough.

14:25.640 --> 14:31.000
Am I implementing Karatsuba multiplication? No, not yet. It could be; I could do it,

14:31.000 --> 14:39.000
but I haven't. In the original Rust crate that I was heavily inspired by,

14:40.920 --> 14:46.120
there is an implementation that decides at compile time whether to use Karatsuba or

14:47.080 --> 14:54.360
classical multiplication, and I just copied the classical implementation, because yeah.
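
The classical (schoolbook) algorithm mentioned here can be sketched like this in Rust, for illustration (not the library's code; the O(n²) structure with fixed loop bounds is what makes it the simple choice):

```rust
// Schoolbook multi-limb multiplication: limbs are little-endian base-2^64
// digits, partial products and carries go through u128. The loop structure
// is fixed by the operand lengths, independent of the limb values.
fn schoolbook_mul(a: &[u64], b: &[u64]) -> Vec<u64> {
    let mut r = vec![0u64; a.len() + b.len()];
    for (i, &ai) in a.iter().enumerate() {
        let mut carry: u128 = 0;
        for (j, &bj) in b.iter().enumerate() {
            let t = ai as u128 * bj as u128 + r[i + j] as u128 + carry;
            r[i + j] = t as u64; // low 64 bits stay in place
            carry = t >> 64;     // high 64 bits carry onward
        }
        r[i + b.len()] = carry as u64;
    }
    r
}

fn main() {
    // (2^64 + 1) * (2^64 + 1) = 2^128 + 2^65 + 1 -> limbs [1, 2, 1, 0]
    assert_eq!(schoolbook_mul(&[1, 1], &[1, 1]), vec![1, 2, 1, 0]);
    println!("ok");
}
```

Karatsuba trades some of these limb multiplications for additions, which pays off only above a size threshold, which is why a library might choose between the two at compile time.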

14:59.400 --> 15:07.000
During the implementation, how often did you have to actually look at the assembly output?

15:07.960 --> 15:21.240
So the question is how often during the implementation I had to look at the assembly output. I did look quite

15:21.240 --> 15:26.360
often at it because I wanted also to check whether my code was vectorized properly,

15:26.360 --> 15:32.520
but the compiler unrolls a lot. So when building the primitives, I looked

15:32.520 --> 15:40.280
quite often at the assembly output. So yeah, it's maybe not always a necessity, but yeah.

15:43.560 --> 15:45.800
So thank you.

