WEBVTT

00:00.000 --> 00:06.000
All right, I think we'll kick off.

00:06.000 --> 00:07.000
Thank you, David.

00:07.000 --> 00:09.000
Thanks a lot, everyone.

00:09.000 --> 00:10.000
I'm Davide Eynard.

00:10.000 --> 00:13.000
I'm presenting Build Your Own Timeline Algorithm,

00:13.000 --> 00:17.000
or, as I like to call it, BYOTA.

00:17.000 --> 00:20.000
It also sounds like Build Your Own Travel Agent.

00:20.000 --> 00:22.000
I think there was another project like that.

00:22.000 --> 00:23.000
Sorry about it.

00:23.000 --> 00:25.000
So, BYOTA: it's not a new algorithm.

00:25.000 --> 00:27.000
It's not an end user application.

00:27.000 --> 00:29.000
It's not a service.

00:29.000 --> 00:32.000
To me, it's a pacifist call to arms.

00:32.000 --> 00:36.000
This is why I chose the Blues Brothers instead of Uncle Sam.

00:36.000 --> 00:38.000
It's a personal project.

00:38.000 --> 00:40.000
I've worked on it over the last couple of years.

00:40.000 --> 00:44.000
It took a while, but I had to wait for the right tools to be around.

00:44.000 --> 00:47.000
And you will see it's just something I put together by

00:47.000 --> 00:49.000
putting together great tools.

00:49.000 --> 00:51.000
How did this start?

00:51.000 --> 00:54.000
It all started with a sink.

00:54.000 --> 00:58.000
And you all know the consequences of this sink, right?

00:58.000 --> 01:03.000
So, a lot of people moved from Twitter to other open social networks.

01:03.000 --> 01:05.000
And to me, it was a bit different.

01:05.000 --> 01:06.000
It started six months earlier.

01:06.000 --> 01:08.000
I was a Twitter employee.

01:08.000 --> 01:12.000
And I joined a smaller cohort and tried to look around

01:12.000 --> 01:16.000
for something that I felt was a bit more in control.

01:16.000 --> 01:20.000
I started to think about how to put something more in the control of people.

01:20.000 --> 01:22.000
The networks were there already.

01:22.000 --> 01:24.000
They were already in the control of people.

01:24.000 --> 01:27.000
The only thing which was missing in my opinion was an algorithm.

01:27.000 --> 01:29.000
A lot of people don't think we need one.

01:29.000 --> 01:31.000
Some of you might think that.

01:31.000 --> 01:33.000
I will tell you what I think about this.

01:33.000 --> 01:35.000
So, timeline algorithms.

01:35.000 --> 01:37.000
We don't have a lot of time to delve deep into them.

01:37.000 --> 01:41.000
The only thing I will leave you with is a very, very short definition.

01:41.000 --> 01:46.000
They define whether and how posts appear on your timeline.

01:46.000 --> 01:52.000
I also put reverse chronological there because to me, that's an algorithm too.

01:52.000 --> 01:55.000
And if you want to delve deeper into, let's say,

01:55.000 --> 01:58.000
platform recommendations and timeline algorithms,

01:58.000 --> 02:01.000
I suggest this blog post, which in my opinion is great,

02:01.000 --> 02:05.000
written by Luke Thorburn, who does research on this specific stuff.

02:05.000 --> 02:10.000
The other thing I studied, and I worked on, were all the past projects

02:10.000 --> 02:13.000
that had been built in the Fediverse about this.

02:13.000 --> 02:15.000
I'm really standing on the shoulders of giants.

02:15.000 --> 02:19.000
I learned a lot just by seeing how people were trying to build new things

02:19.000 --> 02:23.000
and how these were accepted by the people on the Fediverse.

02:23.000 --> 02:27.000
So, by putting all of these together, I came up with some problems

02:27.000 --> 02:31.000
that timeline algorithms more or less all have.

02:31.000 --> 02:34.000
And they don't need to have all of these problems.

02:34.000 --> 02:37.000
Some of them have a few, some of them have more.

02:37.000 --> 02:40.000
And I think they are starting from an assumption, which was,

02:40.000 --> 02:45.000
we need to build something to make users come back to our platform.

02:45.000 --> 02:50.000
And I think if we drop that assumption, we can actually find many solutions to these problems.

02:50.000 --> 02:53.000
And I will not delve deeper into each of them, but just tell you:

02:53.000 --> 02:57.000
by putting together different tools, we can try and tackle this.

02:57.000 --> 03:00.000
So, BYOTA is built on different kinds of tools.

03:00.000 --> 03:04.000
I told you: one of them is called llamafile, another one is called Marimo.

03:04.000 --> 03:06.000
Another one you already know: it's Mastodon.

03:06.000 --> 03:08.000
Actually, the Mastodon.py library.

03:08.000 --> 03:11.000
So, another talk about Python libraries for clients,

03:11.000 --> 03:15.000
because you can actually find plenty: any social network

03:15.000 --> 03:17.000
which is not so stupid to remove.

03:17.000 --> 03:21.000
Sorry, I will rephrase it, which is intelligent enough to give you an API.

03:21.000 --> 03:25.000
will allow you to get your posts.

03:25.000 --> 03:29.000
You can also use summaries from an RSS feed, if you want.

03:29.000 --> 03:32.000
As long as you get text, it's going to work.

03:32.000 --> 03:36.000
A llamafile is a single-file language model.

03:36.000 --> 03:40.000
I purposely removed LLM, I just wrote language model,

03:40.000 --> 03:42.000
because it doesn't need to be large.

03:42.000 --> 03:46.000
There's one that I'm using here, which is called all-MiniLM,

03:46.000 --> 03:48.000
which is 46 megabytes.

03:48.000 --> 03:51.000
It has available code, it's an academic project,

03:51.000 --> 03:54.000
and it has information about the data sets that have been used

03:54.000 --> 03:56.000
to generate its own embeddings.

03:56.000 --> 04:00.000
Llamafile runs it, it's 100% local.

04:00.000 --> 04:04.000
It's optimized to run on slower and older machines,

04:04.000 --> 04:06.000
and it's used to calculate these status embeddings.

04:06.000 --> 04:08.000
If you don't know what an embedding is,

04:08.000 --> 04:10.000
let's say they are numerical descriptors

04:10.000 --> 04:13.000
that are more similar, closer together,

04:13.000 --> 04:16.000
the more the posts that you're trying to embed

04:16.000 --> 04:19.000
are similar, semantically similar, in terms of content.

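NOTE
A minimal sketch of the "closer embeddings for more similar posts" idea described above.
Cosine similarity is assumed here as the closeness measure; the talk does not say which
measure BYOTA uses, and the three short vectors are made up for illustration only.
```python
import math
def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention used in this sketch for all-zero vectors
    return dot / (norm_a * norm_b)
# Made-up 3-dimensional vectors standing in for real post embeddings.
post_about_cats = [0.9, 0.1, 0.0]
post_about_kittens = [0.8, 0.2, 0.1]
post_about_taxes = [0.0, 0.1, 0.9]
similar = cosine_similarity(post_about_cats, post_about_kittens)
different = cosine_similarity(post_about_cats, post_about_taxes)
print(similar > different)  # prints True
```
Real embeddings have hundreds of dimensions, but the comparison works the same way.
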
04:19.000 --> 04:23.000
Marimo is a reactive notebook in Python.

04:23.000 --> 04:25.000
It's shareable as an application,

04:25.000 --> 04:28.000
and runs in your browser as a WASM-powered,

04:28.000 --> 04:30.000
standalone HTML file.

04:30.000 --> 04:33.000
Let me pause a second and tell you again.

04:33.000 --> 04:37.000
You can clone my code.

04:37.000 --> 04:40.000
You can customize the algorithm, write your own.

04:40.000 --> 04:43.000
You can serve it on a solar-powered

04:43.000 --> 04:46.000
Raspberry Pi HTML server,

04:46.000 --> 04:49.000
HTTP server, maybe that's a bit too much,

04:49.000 --> 04:50.000
but you can do that.

04:50.000 --> 04:52.000
The code is not going to run there,

04:52.000 --> 04:54.000
somebody else is going to download this file,

04:54.000 --> 04:57.000
run all the code on their own laptop locally,

04:57.000 --> 05:00.000
and it will run your own algorithm.

05:00.000 --> 05:02.000
This is what Marimo looks like.

05:02.000 --> 05:05.000
It's like a typical Python notebook.

05:05.000 --> 05:08.000
It's not the ideal user interface.

05:08.000 --> 05:11.000
This is what the application version looks like.

05:11.000 --> 05:14.000
All the code goes away and just the UI elements are there.

05:14.000 --> 05:17.000
So you start making something available to people.

05:17.000 --> 05:20.000
This is an extra feature, the grid feature,

05:20.000 --> 05:23.000
where people can compose and move different parts of the UI

05:23.000 --> 05:27.000
so that they can use them differently for their own use cases.

05:27.000 --> 05:30.000
In practice, what does the tool do?

05:30.000 --> 05:32.000
Embedding visualization.

05:32.000 --> 05:34.000
Here you can see embeddings from four different timelines.

05:34.000 --> 05:37.000
I took the embeddings, plotted them into the,

05:37.000 --> 05:40.000
this is my home timeline, the people I follow.

05:40.000 --> 05:42.000
This one is the local timeline.

05:42.000 --> 05:44.000
This is the federated one,

05:44.000 --> 05:48.000
and this is the one that I get when I look for the gopher tag.

05:48.000 --> 05:50.000
Yes, I'm another Gopher fan.

05:50.000 --> 05:52.000
So if you put all of these together,

05:52.000 --> 05:54.000
you can generate this map.

05:54.000 --> 05:56.000
It's kind of geographical.

05:56.000 --> 05:58.000
This is how I call the different parts.

05:58.000 --> 06:01.000
It's fun and nice and I spent a lot of time preparing that.

06:01.000 --> 06:06.000
The nice thing is that it shows how posts from different timelines

06:06.000 --> 06:08.000
can actually be grouped together,

06:08.000 --> 06:11.000
and you can get recommendations from one timeline to the other,

06:11.000 --> 06:12.000
if you want.

06:12.000 --> 06:15.000
And how can you use this stuff?

06:15.000 --> 06:17.000
Apart from looking at these things,

06:17.000 --> 06:18.000
and saying for instance,

06:18.000 --> 06:20.000
this algorithm is a bit weird.

06:20.000 --> 06:22.000
Family and parenting embeddings

06:22.000 --> 06:25.000
are close to not-safe-for-work stuff.

06:25.000 --> 06:29.000
Just a heads up, the red ones are the public timeline.

06:29.000 --> 06:30.000
It's not my personal timeline.

06:31.000 --> 06:33.000
You can also do semantic search.

06:33.000 --> 06:34.000
You can take a sentence,

06:34.000 --> 06:37.000
such as "I'm a fan of free software and everything open source,"

06:37.000 --> 06:40.000
and get posts which are related to that thing,

06:40.000 --> 06:44.000
or you can get the post ID from the previous embedding plot,

06:44.000 --> 06:47.000
and just give the number and see the posts which are close to that.

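NOTE
A minimal sketch of the semantic search described above: rank stored post vectors by
closeness to a query vector and keep the top k. The function name semantic_search, the
post IDs, the toy vectors, and the use of cosine similarity are all assumptions made for
illustration. In the real tool the query vector would come from the embedding model.
```python
import math
def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention used in this sketch for all-zero vectors
    return dot / (norm_a * norm_b)
def semantic_search(query_vec, post_vecs, k=2):
    """Return the ids of the k posts whose vectors are closest to query_vec."""
    ranked = sorted(
        post_vecs,
        key=lambda post_id: cosine_similarity(query_vec, post_vecs[post_id]),
        reverse=True,
    )
    return ranked[:k]
# Hypothetical post ids mapped to made-up embeddings.
post_vecs = {101: [0.9, 0.1, 0.0], 102: [0.1, 0.9, 0.0], 103: [0.7, 0.3, 0.1]}
# Stand-in for the embedding of a query sentence.
query_vec = [1.0, 0.0, 0.0]
top = semantic_search(query_vec, post_vecs, k=2)
print(top)  # prints [101, 103]
```
Searching by post ID is the same call with post_vecs[post_id] as the query; the post
itself then ranks first, followed by its nearest neighbours.
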
06:47.000 --> 06:49.000
You can do re-ranking,

06:49.000 --> 06:52.000
that is, you can have your own numerical timeline,

06:52.000 --> 06:54.000
and in terms of performance,

06:54.000 --> 06:56.000
it will run on your laptop.

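NOTE
A minimal sketch of what the re-ranking mentioned above could look like. The talk does not
describe the scoring rule, so this one is an assumption: each timeline post is scored by
its mean cosine similarity to vectors of posts the reader liked. The names rerank,
timeline and liked_vecs, and all the vectors, are hypothetical.
```python
import math
def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention used in this sketch for all-zero vectors
    return dot / (norm_a * norm_b)
def rerank(timeline, liked_vecs):
    """Order (post_id, vector) pairs by mean similarity to liked_vecs, best first."""
    if not liked_vecs:
        return [post_id for post_id, _ in timeline]  # nothing to learn from: keep order
    def score(item):
        _, vec = item
        return sum(cosine_similarity(vec, liked) for liked in liked_vecs) / len(liked_vecs)
    return [post_id for post_id, _ in sorted(timeline, key=score, reverse=True)]
# Made-up timeline in its original order, plus made-up "liked" vectors.
timeline = [(1, [0.0, 1.0]), (2, [1.0, 0.0]), (3, [0.7, 0.7])]
liked_vecs = [[1.0, 0.1], [0.9, 0.0]]
new_order = rerank(timeline, liked_vecs)
print(new_order)  # prints [2, 3, 1]
```
Swapping the score function is all it takes to try a different ranking rule.
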
06:56.000 --> 06:58.000
I tried it on a 2016 MacBook.

06:58.000 --> 07:00.000
It runs in 40 seconds,

07:00.000 --> 07:02.000
and you can embed your stuff.

07:02.000 --> 07:04.000
So it's fully local.

07:04.000 --> 07:06.000
You can embed stuff without connecting anywhere else.

07:06.000 --> 07:08.000
It runs in your browser.

07:08.000 --> 07:10.000
The only thing you download is the libraries

07:10.000 --> 07:12.000
that Marimo wants to use,

07:12.000 --> 07:15.000
and then you only connect to your API

07:15.000 --> 07:18.000
and to the local embedding server.

07:18.000 --> 07:21.000
And this is a call to arms, as I told you before.

07:21.000 --> 07:23.000
I would like it to grow as a tool

07:23.000 --> 07:25.000
that people can use to build stuff

07:25.000 --> 07:27.000
and easily share it with people.

07:27.000 --> 07:29.000
And once more, you will see the protocols,

07:29.000 --> 07:31.000
not platforms thing.

07:31.000 --> 07:33.000
It would be great if you had something like a proxy

07:33.000 --> 07:36.000
that runs in between the client and any kind of server

07:36.000 --> 07:38.000
that does this thing for you.

07:38.000 --> 07:39.000
So this is it.

07:39.000 --> 07:41.000
Thanks a lot for hearing me.

07:41.000 --> 07:42.000
And thank you.

07:48.000 --> 07:50.000
Alright, thank you David.

07:50.000 --> 07:52.000
That was awesome.

07:52.000 --> 07:54.000
So we've had three short talks really close together.

07:54.000 --> 07:57.000
We haven't had time for questions for those.

07:57.000 --> 07:59.000
The speakers I'm sure will be available somewhere

07:59.000 --> 08:00.000
in the Fediverse.

08:00.000 --> 08:04.000
And we'd love to hear from you if you've got questions for them.

08:04.000 --> 08:07.000
Could folks just try and use a few more of these seats

08:07.000 --> 08:10.000
that are still sitting in the middle here

08:10.000 --> 08:12.000
because we've got lots of people coming and going

08:12.000 --> 08:13.000
which is wonderful.

08:13.000 --> 08:15.000
I want to make sure that everybody gets a chance

08:15.000 --> 08:17.000
to sit comfortably.

08:17.000 --> 08:21.000
So for those of you that don't know me at all.

08:22.000 --> 08:27.000
My name is Andy Piper and I work with the Mastodon project

08:27.000 --> 08:30.000
but I also try and contribute across the Fediverse

08:30.000 --> 08:34.000
and I'm really excited to welcome our next speaker.

