WEBVTT

00:00.000 --> 00:07.000
Thank you very much and thanks everyone for coming.

00:07.000 --> 00:11.000
It's amazing to see a full house here in the Search devroom, and I'm excited to be here

00:11.000 --> 00:17.000
at FOSDEM 2026, and in this session we'll talk about OpenSearch and search innovations

00:17.000 --> 00:19.000
with OpenSearch v3.

00:19.000 --> 00:21.000
I'm Dotan Horovits.

00:21.000 --> 00:24.000
I'm an OpenSearch ambassador under the Linux Foundation.

00:24.000 --> 00:29.000
I'm also a CNCF ambassador, and in my day job I'm a Senior Open Source Advocate at AWS.

00:29.000 --> 00:38.000
And I'll be co-speaking with Aswath, Senior Search Architect for OpenSearch at AWS.

00:38.000 --> 00:44.000
So for those who don't know OpenSearch, it's essentially a platform for AI-powered search,

00:44.000 --> 00:47.000
observability and analytics.

00:47.000 --> 00:54.000
On the search side it provides the full suite, from lexical search to dense vector search,

00:54.000 --> 01:00.000
and the interesting stuff now: hybrid search, multi-modal search, agentic search.

01:00.000 --> 01:03.000
So, the full end-to-end search capabilities.

01:03.000 --> 01:07.000
It's under the Apache 2.0 open source license.

01:07.000 --> 01:11.000
It's a top-level project under the Linux Foundation.

01:11.000 --> 01:16.000
It started back in 2021 as a fork of Elasticsearch and Kibana,

01:16.000 --> 01:22.000
after these projects were relicensed from Apache 2.0 to a non-open-source licensing scheme.

01:22.000 --> 01:27.000
So the community wanted to keep them open, AWS stepped up and initiated the fork.

01:27.000 --> 01:33.000
And then fast forward to 2024, AWS transferred the project to the Linux Foundation.

01:33.000 --> 01:36.000
So it's owned by the Linux Foundation, not by any single vendor.

01:36.000 --> 01:44.000
And there's the OpenSearch Software Foundation under the LF to foster open governance and open collaboration.

01:44.000 --> 01:50.000
And I'm really happy to say that many companies have already joined the foundation.

01:50.000 --> 01:55.000
And you see some examples here, Uber, SAP, NetApp, IBM and others.

01:55.000 --> 02:01.000
and are taking part in steering the future of this project.

02:01.000 --> 02:09.000
It's a massive code base: 140 repos under the GitHub org, 1.6 billion downloads to date.

02:09.000 --> 02:16.000
It's a super active project, more than 3,000 active contributors from more than 400 different organizations.

02:16.000 --> 02:19.000
Actually, I checked the LFX leaderboard.

02:19.000 --> 02:25.000
I think it's the 12th most active project under the LF, among nearly 1,000 projects.

02:25.000 --> 02:28.000
So pretty impressive.

02:28.000 --> 02:34.000
And the biggest news of the past year is that after three years of 2.x,

02:34.000 --> 02:38.000
we finally reached version 3, the next major release.

02:38.000 --> 02:42.000
And this major release has foundational changes.

02:42.000 --> 02:46.000
Among them, Apache Lucene moving to version 10.

02:46.000 --> 02:54.000
We have architectural changes, such as reader/writer separation, to separate the search and indexing workloads.

02:54.000 --> 02:59.000
Streaming architecture, pull-based ingestion, native MCP support.

02:59.000 --> 03:06.000
Many new commands in PPL, the Piped Processing Language, the query language we have as part of the project.

03:06.000 --> 03:11.000
And out-of-the-box support for Apache Calcite as the engine, and much, much more.

03:11.000 --> 03:15.000
So in this session, we'd like to share with you some of the highlights of v3.

03:15.000 --> 03:19.000
And I'd like to hand it over to Aswath to start showing us this stuff.

03:19.000 --> 03:20.000
Go ahead, Aswath.

03:20.000 --> 03:21.000
Thank you.

03:27.000 --> 03:28.000
Okay.

03:28.000 --> 03:31.000
So, I'll start, thank you.

03:31.000 --> 03:34.000
The first topic is neural sparse search in OpenSearch.

03:34.000 --> 03:37.000
I like to call it budget-friendly semantic search.

03:37.000 --> 03:41.000
And if you have used the dense vectors, you will probably agree with me.

03:41.000 --> 03:47.000
The neural sparse model that's on Hugging Face, pre-trained by OpenSearch,

03:47.000 --> 03:51.000
is one of the most popular, most downloaded sparse models.

03:51.000 --> 03:55.000
But the catch with neural sparse has always been scalability.

04:03.000 --> 04:07.000
So, beyond 10 million documents, the latency is quite high.

04:07.000 --> 04:10.000
To an extent where I start recommending dense vectors instead.

04:10.000 --> 04:14.000
But at that point, we are no longer in budget-friendly semantic search territory.

04:14.000 --> 04:18.000
So, with the SEISMIC algorithm implementation,

04:18.000 --> 04:21.000
we are able to push past the 10-million ceiling.

04:21.000 --> 04:25.000
We are able to go to a billion, with a B.

04:26.000 --> 04:30.000
And this is by being able to do approximate sparse retrieval.

04:30.000 --> 04:36.000
So, there's a new data structure, the forward index, which actually holds the doc ID to

04:36.000 --> 04:38.000
sparse vector mapping.

04:38.000 --> 04:41.000
And then we have the inverted index, which probably most of you understand,

04:41.000 --> 04:46.000
but with some slight changes, where the posting list goes through a pruning process.

04:46.000 --> 04:49.000
It only retains the top-weighted doc IDs.

04:50.000 --> 04:55.000
So, for example, let's say a term exists in doc 1, doc 2, doc 5.

04:55.000 --> 04:59.000
And in each document, that term exists with a different weight.

04:59.000 --> 05:04.000
So, pruning essentially removes those docs that have lower weights.

05:04.000 --> 05:10.000
And then what we do is group similar documents, forming a cluster.

05:10.000 --> 05:13.000
And for every cluster, a summary vector is created.

05:13.000 --> 05:16.000
So, this is what you see in block 1, block 2.

05:16.000 --> 05:19.000
So, it's basically a cluster with a summary.

05:19.000 --> 05:24.000
And now that the index is created, during search time, the query vector

05:24.000 --> 05:26.000
is scored against the summary vectors.

05:26.000 --> 05:30.000
If the score is too low, the cluster is simply skipped.

05:30.000 --> 05:33.000
If the score is decent enough, then the cluster is examined.

05:33.000 --> 05:38.000
And for the full final score, we go to the forward index, fetch the full

05:38.000 --> 05:41.000
sparse vector to do the final score calculation.

05:41.000 --> 05:43.000
So, this is how it works.

05:43.000 --> 05:44.000
I mean just to summarize.

05:44.000 --> 05:48.000
So: ingestion, then the pruning process happens.

05:48.000 --> 05:52.000
And then similar documents are grouped together, forming a cluster.

05:52.000 --> 05:54.000
And every cluster gets a summary.

05:54.000 --> 05:57.000
Block 1, block 2. Then search happens.

05:57.000 --> 06:00.000
When the search happens, we query the summary vector.

06:00.000 --> 06:03.000
If the score is too low, the cluster is skipped.

06:03.000 --> 06:08.000
If the score is large enough, then we examine the cluster.

06:08.000 --> 06:10.000
And when the cluster is examined,

06:10.000 --> 06:14.000
the full vector is fetched from the forward index, and the final scoring is done.

06:14.000 --> 06:19.000
So, this is how we are able to do approximate sparse retrieval.

06:19.000 --> 06:24.000
And this is how we are also able to scale beyond the 10-million ceiling that we had in the past.
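To make the flow concrete, here is a toy sketch of this SEISMIC-style index and search path in Python. It is a drastic simplification with made-up data structures, not the actual OpenSearch implementation: posting lists are pruned to the top-weighted doc IDs, documents are grouped into fixed-size clusters with a summary vector (element-wise max of the members), and at search time weak clusters are skipped before exact scoring from the forward index.

```python
from collections import defaultdict

def build_index(docs, prune_top_k=2, cluster_size=2):
    """docs: {doc_id: {term: weight}} sparse vectors.
    Returns the forward index, the pruned inverted index, and clusters."""
    forward = dict(docs)  # doc ID -> full sparse vector
    inverted = defaultdict(list)
    for doc_id, vec in docs.items():
        for term, weight in vec.items():
            inverted[term].append((doc_id, weight))
    # Pruning: each posting list retains only the top-weighted doc IDs.
    for term in inverted:
        inverted[term] = sorted(inverted[term], key=lambda p: -p[1])[:prune_top_k]
    # Group documents into fixed-size clusters; each cluster gets a
    # summary vector (element-wise max of its members' vectors).
    ids = sorted(docs)
    clusters = []
    for i in range(0, len(ids), cluster_size):
        members = ids[i:i + cluster_size]
        summary = defaultdict(float)
        for doc_id in members:
            for term, weight in docs[doc_id].items():
                summary[term] = max(summary[term], weight)
        clusters.append((members, dict(summary)))
    return forward, dict(inverted), clusters

def dot(a, b):
    return sum(w * b.get(t, 0.0) for t, w in a.items())

def search(query, forward, clusters, threshold=0.5):
    """Score the query against each cluster summary; clusters that score
    too low are skipped, survivors are scored exactly from the forward index."""
    results = []
    for members, summary in clusters:
        if dot(query, summary) < threshold:
            continue  # cluster skipped entirely
        for doc_id in members:
            results.append((doc_id, dot(query, forward[doc_id])))
    return sorted(results, key=lambda r: -r[1])
```

The saving comes from the skip: most clusters never have their members scored, which is what makes the retrieval approximate but fast.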

06:24.000 --> 06:25.000
Yeah.

06:25.000 --> 06:26.000
So, give it a shot.

06:26.000 --> 06:27.000
Give it a try.

06:27.000 --> 06:28.000
Let us know how it goes.

06:28.000 --> 06:29.000
You know the drill.

06:29.000 --> 06:30.000
Yeah.

06:30.000 --> 06:37.000
So, v3-GTE is the pre-trained model that I was talking about,

06:37.000 --> 06:40.000
the most popular, most downloaded Hugging Face model.

06:40.000 --> 06:41.000
Yeah.
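For reference, this is roughly the shape of a neural sparse query body; the index name, field name, and model ID below are placeholders, so check the neural sparse documentation for your own setup.

```python
# Sketch of a neural sparse search request body. The field name
# ("passage_embedding") and the model ID are hypothetical examples.
request_body = {
    "query": {
        "neural_sparse": {
            "passage_embedding": {
                "query_text": "nike black shoes",
                "model_id": "<your-sparse-encoder-model-id>",
            }
        }
    }
}
# Typically sent as: GET /my-index/_search with this JSON body.
```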

06:41.000 --> 06:42.000
And like I said, right.

06:42.000 --> 06:46.000
So, the benchmark that we did was over a 1.3-billion-doc dataset.

06:48.000 --> 06:50.000
Look at the bars here.

06:50.000 --> 06:55.000
The latency, with BM25 as the baseline:

06:55.000 --> 06:59.000
We were able to get four times faster than BM25.

06:59.000 --> 07:04.000
And about 11 times faster than the traditional neural sparse.

07:04.000 --> 07:05.000
Yeah.

07:05.000 --> 07:10.000
And the line above tells you the recall; there's a slight drop in recall.

07:10.000 --> 07:17.000
As a search architect, a search practitioner, for me, a slight drop in recall with the kind of

07:17.000 --> 07:18.000
latency gain that I get?

07:18.000 --> 07:20.000
It's a tremendous trade-off for me.

07:20.000 --> 07:21.000
I'm not sure about you.

07:21.000 --> 07:22.000
Yeah.

07:22.000 --> 07:23.000
So, yeah.

07:23.000 --> 07:24.000
Please try.

07:24.000 --> 07:25.000
Give it a shot.

07:25.000 --> 07:26.000
Yeah.

07:26.000 --> 07:28.000
So, I will move on to a new topic.

07:28.000 --> 07:30.000
System-generated search pipelines.

07:30.000 --> 07:34.000
But in order for me to explain this one, you need to know that search pipelines exist in

07:34.000 --> 07:35.000
OpenSearch.

07:35.000 --> 07:38.000
And search pipelines are very similar to ingest pipelines.

07:38.000 --> 07:42.000
A bunch of processors for data transformation that you put together.

07:42.000 --> 07:44.000
but on the search side, right?

07:44.000 --> 07:48.000
The only difference is that the processors are of different types.

07:48.000 --> 07:52.000
It could be a request processor, a response processor, or a phase results processor.

07:52.000 --> 07:58.000
To give you some tangible examples, the normalization processor for normalizing the score

07:58.000 --> 08:00.000
between lexical and k-NN search,

08:00.000 --> 08:05.000
and the cross-encoder for re-ranking, and the semantic highlighter.

08:05.000 --> 08:06.000
Yeah.

08:06.000 --> 08:10.000
So, these are different processors that you put together to create a search pipeline.
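As a concrete illustration, a hybrid-search pipeline with a normalization processor can be defined like this. The pipeline name, weights, and description are made up for the example; the overall shape follows the search-pipeline API.

```python
import json

# Hypothetical pipeline definition: a normalization-processor as the
# phase results processor, combining lexical and k-NN scores.
hybrid_pipeline = {
    "description": "Normalize and combine lexical and k-NN scores",
    "phase_results_processors": [
        {
            "normalization-processor": {
                "normalization": {"technique": "min_max"},
                "combination": {
                    "technique": "arithmetic_mean",
                    "parameters": {"weights": [0.3, 0.7]},
                },
            }
        }
    ],
}

# Registered with: PUT /_search_pipeline/my-hybrid-pipeline
# Used with:       GET /my-index/_search?search_pipeline=my-hybrid-pipeline
print(json.dumps(hybrid_pipeline, indent=2))
```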

08:10.000 --> 08:15.000
But the problem is always that you have to create

08:15.000 --> 08:21.000
them in advance, define them in advance, register them in advance before you are

08:21.000 --> 08:23.000
able to use them.

08:23.000 --> 08:24.000
Right.

08:24.000 --> 08:27.000
So, as the name suggests, the system-generated search pipeline

08:27.000 --> 08:31.000
It attempts to eliminate the need of having to create them in advance.

08:31.000 --> 08:37.000
So, to give you an example, if you are implementing e-commerce search, let's say you want to

08:37.000 --> 08:39.000
diversify your search results.

08:39.000 --> 08:42.000
and you want to use MMR, maximal marginal relevance.

08:42.000 --> 08:45.000
Now, you don't have to create these search pipelines in advance.

08:45.000 --> 08:48.000
You simply mention it in the search request,

08:48.000 --> 08:51.000
And the search pipeline is automatically created for you on the fly.

08:51.000 --> 08:55.000
So, it removes the operational overhead that you have.

08:55.000 --> 09:00.000
But the only caveat is that it is not available for all the search processors out there yet.

09:00.000 --> 09:01.000
Yeah.

09:01.000 --> 09:04.000
So, you still have to create the search pipelines.

09:04.000 --> 09:09.000
And for anyone who is new, there is a steep learning curve because you need to use the

09:09.000 --> 09:11.000
APIs to create the search pipelines.

09:11.000 --> 09:12.000
Yeah.

09:12.000 --> 09:18.000
So, once again, to cut down the learning curve, we have something for those who like

09:18.000 --> 09:20.000
the click-click-create approach.

09:20.000 --> 09:21.000
So, we have this.

09:21.000 --> 09:23.000
So, you have, as you see here, right.

09:23.000 --> 09:27.000
So, different templates for hybrid search, multi-modal search, RAG,

09:27.000 --> 09:28.000
and also agentic search.

09:28.000 --> 09:29.000
Yeah.

09:29.000 --> 09:32.000
So, you have different templates which you can use.

09:32.000 --> 09:37.000
And you can also use the custom flow if you want to, you know, go beyond the basics that we have.

09:37.000 --> 09:39.000
And it would look something like this.

09:39.000 --> 09:41.000
This is a full-blown screenshot.

09:41.000 --> 09:42.000
But bear with me.

09:42.000 --> 09:43.000
So, it goes like this.

09:43.000 --> 09:45.000
So, you create the ingest pipeline.

09:45.000 --> 09:46.000
Right.

09:46.000 --> 09:48.000
You can add a bunch of processors into it.

09:48.000 --> 09:52.000
I have the ML inference processor that can convert text to embeddings.

09:52.000 --> 09:55.000
And then, of course, you can add more processors.

09:55.000 --> 09:56.000
This is just an example.

09:56.000 --> 09:57.000
Yeah.

09:57.000 --> 10:01.000
And then, we have the search pipeline where I have the same ML inference processor.

10:01.000 --> 10:04.000
converting the query text to embeddings.

10:04.000 --> 10:05.000
Yeah.

10:05.000 --> 10:08.000
And this is how you define what the processor should be doing.

10:08.000 --> 10:12.000
So, it allows me to pick which model I want,

10:12.000 --> 10:16.000
which field it should look at to convert the text,

10:16.000 --> 10:19.000
and once the text is converted to embeddings, where it should go.

10:19.000 --> 10:24.000
So, basically, you can configure the entire processor with the UI.

10:24.000 --> 10:25.000
Yeah.

10:25.000 --> 10:26.000
Not just that.

10:26.000 --> 10:28.000
You can also test.

10:28.000 --> 10:32.000
You can simulate the entire flow in here before you start deploying it.

10:32.000 --> 10:33.000
Yeah.

10:33.000 --> 10:36.000
So, it's a very nice utility, a developer tool.

10:36.000 --> 10:37.000
Yeah.

10:37.000 --> 10:39.000
But let's say you are at this point.

10:39.000 --> 10:41.000
You have created the search pipeline.

10:41.000 --> 10:42.000
Everything is good.

10:42.000 --> 10:44.000
You have deployed it. The next thing that you need to know is:

10:44.000 --> 10:46.000
is the search pipeline working at all?

10:46.000 --> 10:47.000
Yeah.

10:47.000 --> 10:51.000
So, enter the Search Relevance Workbench, an awesome tool,

10:51.000 --> 10:55.000
I would say, especially for your relevancy engineers, your data scientists.

10:55.000 --> 10:56.000
Yeah.

10:56.000 --> 10:58.000
So, as it shows here, right.

10:58.000 --> 11:00.000
You have four different things that you can do.

11:00.000 --> 11:02.000
You can compare a single query;

11:02.000 --> 11:04.000
you can side-by-side compare

11:04.000 --> 11:06.000
both sets of search results.

11:06.000 --> 11:10.000
You can also do it at volume, using a query set.

11:10.000 --> 11:14.000
You can also do search evaluation, which basically gives you all the

11:14.000 --> 11:16.000
good search quality metrics:

11:16.000 --> 11:18.000
the NDCGs and the precisions.

11:18.000 --> 11:19.000
Yeah.

11:19.000 --> 11:23.000
And the hybrid search optimizer, which is slightly different from the rest of it.

11:23.000 --> 11:30.000
So, if you have used hybrid search, regardless of the vector database, let's say you combine

11:30.000 --> 11:33.000
lexical search and k-NN search.

11:33.000 --> 11:34.000
Right.

11:35.000 --> 11:38.000
There are a lot of parameters, hyperparameters, that you need to tune.

11:38.000 --> 11:41.000
So, the normalization technique: is it z-score or min-max or

11:41.000 --> 11:42.000
L2?

11:42.000 --> 11:44.000
And then you have the combination technique.

11:44.000 --> 11:49.000
Is it arithmetic mean or geometric mean or harmonic mean?

11:49.000 --> 11:50.000
Yeah.

11:50.000 --> 11:53.000
And then lastly, you also have the weight

11:53.000 --> 11:55.000
distribution, which is

11:55.000 --> 11:56.000
lexical to k-NN:

11:56.000 --> 11:59.000
is it 0.1/0.9? 0.2/0.8?

11:59.000 --> 12:00.000
Yeah.

12:00.000 --> 12:04.000
So, you can compute the number of combinations that you would arrive at.

12:04.000 --> 12:07.000
That's way too many combinations.

12:07.000 --> 12:11.000
So how would you know which combination works the best for you?
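A quick back-of-the-envelope sketch of that grid, assuming three normalization techniques, three combination techniques, and nine weight splits (these particular lists are illustrative, not an exhaustive enumeration of what the engine supports):

```python
from itertools import product

# Hypothetical tuning grid: normalization technique x combination
# technique x lexical weight (the k-NN side gets the remainder).
normalizations = ["min_max", "l2", "z_score"]
combinations = ["arithmetic_mean", "geometric_mean", "harmonic_mean"]
lexical_weights = [round(w / 10, 1) for w in range(1, 10)]  # 0.1 .. 0.9

grid = list(product(normalizations, combinations, lexical_weights))
print(len(grid))  # 3 * 3 * 9 = 81 candidate configurations
```

Even this modest grid gives 81 configurations to evaluate per query set, which is exactly the kind of sweep the optimizer automates.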

12:11.000 --> 12:13.000
This is what the hybrid search optimizer does.

12:13.000 --> 12:18.000
I will show it in the demo, which is what I'm going to do right now.

12:18.000 --> 12:22.000
So, this is the new OpenSearch Dashboards.

12:22.000 --> 12:30.000
It's not just a new look and feel; it also lets you add different OpenSearch clusters

12:30.000 --> 12:32.000
to the same OpenSearch UI.

12:32.000 --> 12:38.000
So, with this we go past the one-dashboard-per-backend model.

12:38.000 --> 12:42.000
Now you are able to connect to different backends, OpenSearch clusters.

12:42.000 --> 12:47.000
And you can also create workspaces, a classic multi-tenancy use case, right?

12:47.000 --> 12:49.000
And then you create a workspace.

12:49.000 --> 12:53.000
It could also be of different types: observability, or security analytics, or search,

12:53.000 --> 12:54.000
and depending on the type,

12:54.000 --> 12:56.000
the menus change.

12:56.000 --> 13:00.000
And of course, if you want all the menus in the world like me,

13:00.000 --> 13:03.000
I go to the analytics.

13:03.000 --> 13:06.000
I get all the menus in the world, exactly.

13:06.000 --> 13:11.000
So, the first thing I'm going to show you is agentic search.

13:11.000 --> 13:14.000
And I'm going to compare that with the lexical search.

13:14.000 --> 13:19.000
So, for this, I go to the Search Relevance Workbench, single query comparison.

13:19.000 --> 13:20.000
Yeah?

13:20.000 --> 13:21.000
I have to type the query.

13:21.000 --> 13:23.000
Of course, I'm not going to do that now.

13:23.000 --> 13:25.000
I have it in here.

13:25.000 --> 13:30.000
And what you see here is I'm going to compare agentic search with the multi-match lexical search

13:30.000 --> 13:32.000
that you probably are used to.

13:32.000 --> 13:34.000
I'm going to use a local cluster for both.

13:34.000 --> 13:37.000
I'm going to use the e-commerce index for both.

13:37.000 --> 13:41.000
But of course, on this side, I'm going to use the agentic pipeline.

13:41.000 --> 13:43.000
This is a search pipeline that I created.

13:43.000 --> 13:44.000
Yeah?

13:44.000 --> 13:47.000
And I'm going to search for, let's say, Nike.

13:47.000 --> 13:48.000
Yeah?

13:48.000 --> 13:50.000
I'm going to hit a search.

13:50.000 --> 13:52.000
I have a very nice comparison here.

13:52.000 --> 13:55.000
There are three overlaps, seven unique results on each side.

13:55.000 --> 13:57.000
I'm going to change this to title.

13:57.000 --> 14:00.000
Make it a little bit bigger.

14:00.000 --> 14:03.000
So, the results are quite nice on both sides.

14:03.000 --> 14:05.000
But, of course, nobody is going to search for Nike.

14:05.000 --> 14:07.000
Nobody is going to search for jeans, right?

14:07.000 --> 14:09.000
So, they're going to be a bit more explicit, right?

14:09.000 --> 14:11.000
And I want to make a typo here, right?

14:11.000 --> 14:13.000
Nike black.

14:13.000 --> 14:20.000
Okay.

14:20.000 --> 14:21.000
Nike black shoes.

14:21.000 --> 14:23.000
Oh, one more unintentional typo.

14:23.000 --> 14:26.000
But, yeah, agentic search is able to handle it, right?

14:26.000 --> 14:28.000
So, if you now look at the results, right?

14:28.000 --> 14:29.000
This is no longer Nike.

14:29.000 --> 14:30.000
I mean, it is still shoes.

14:30.000 --> 14:31.000
Yeah?

14:31.000 --> 14:32.000
But, it is no longer Nike.

14:32.000 --> 14:36.000
Whereas here, as an end user,

14:36.000 --> 14:40.000
I find this one much more relevant compared to the other one.

14:40.000 --> 14:41.000
Yeah?

14:41.000 --> 14:43.000
Now, let's continue the journey.

14:43.000 --> 14:49.000
Let's say under $50.

14:49.000 --> 14:50.000
Yeah?

14:50.000 --> 14:55.000
So, now, I no longer get shoes.

14:55.000 --> 14:56.000
I get cookies.

14:56.000 --> 14:57.000
I get whatnot.

14:57.000 --> 14:58.000
Yeah?

14:58.000 --> 15:01.000
Once again, on this side, I have much more relevant results.

15:01.000 --> 15:03.000
Of course, it shows socks here,

15:03.000 --> 15:05.000
and you'll see why that is happening.

15:05.000 --> 15:08.000
But, also, $50, you get what you pay for, right?

15:08.000 --> 15:10.000
With this inflation.

15:10.000 --> 15:13.000
So, I'll show you what happens under the hood.

15:13.000 --> 15:14.000
There's no magic.

15:14.000 --> 15:15.000
Yeah?

15:15.000 --> 15:19.000
So, for this, we go to the AI search flow page.

15:19.000 --> 15:21.000
Remember, this is the one I showed you.

15:21.000 --> 15:25.000
And this is the one I used to create this agentic search workflow.

15:25.000 --> 15:27.000
This is the one that I created.

15:27.000 --> 15:29.000
And the agent I have.

15:29.000 --> 15:31.000
This is the agent that I created.

15:31.000 --> 15:34.000
So, let's run the search first before I start explaining it.

15:34.000 --> 15:36.000
And this is exactly the same search that I did.

15:36.000 --> 15:40.000
And this is exactly what is causing the magic, right?

15:40.000 --> 15:44.000
So, agentic search is able to break down the user query.

15:44.000 --> 15:45.000
Understand the user query.

15:45.000 --> 15:48.000
break it down into, let's say: Nike is a brand,

15:48.000 --> 15:50.000
Shoes are category.

15:50.000 --> 15:52.000
Black is a color.

15:52.000 --> 15:53.000
And price is a range.

15:53.000 --> 15:54.000
Yeah?

15:54.000 --> 15:55.000
And this is how.

15:55.000 --> 15:56.000
And this is the lexical search.

15:56.000 --> 15:57.000
There is no k-NN.

15:57.000 --> 15:58.000
And there is no sparse.

15:58.000 --> 15:59.000
There's nothing.

15:59.000 --> 16:00.000
There's just lexical search.

16:00.000 --> 16:03.000
Just a powerful rewriting of the query, right?
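To illustrate, a rewrite of something like "nike black shoes under $50" could plausibly come out as a plain bool DSL query of this shape; the field names here are hypothetical, since the real query planner output depends on the index mapping.

```python
# Hypothetical rewritten query: brand, category, color, price are
# invented field names for an e-commerce index.
rewritten_query = {
    "query": {
        "bool": {
            "must": [
                {"match": {"brand": "Nike"}},
                {"match": {"category": "Shoes"}},
                {"match": {"color": "Black"}},
            ],
            "filter": [
                {"range": {"price": {"lte": 50}}},
            ],
        }
    }
}
```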

16:03.000 --> 16:04.000
I get the raw results.

16:04.000 --> 16:07.000
The exact same results just to make sure that I'm not cheating.

16:07.000 --> 16:08.000
Yeah?

16:08.000 --> 16:11.000
So, now we go to the raw response, right?

16:11.000 --> 16:17.000
And we see in the category for the first result, right?

16:17.000 --> 16:21.000
so, Clothing, Shoes & Jewelry, and athletic socks.

16:21.000 --> 16:22.000
No wonder.

16:22.000 --> 16:24.000
That's why socks are showing up in a shoe search.

16:24.000 --> 16:26.000
It's a cataloging problem,

16:26.000 --> 16:29.000
Not an agentic search problem, yeah?

16:29.000 --> 16:31.000
You also get the rewritten query here.

16:31.000 --> 16:33.000
But who's going to rewrite it?

16:33.000 --> 16:35.000
Which is what is happening on this side.

16:35.000 --> 16:37.000
So, we have this query planning tool.

16:37.000 --> 16:39.000
This is the tool for this agent.

16:39.000 --> 16:41.000
And as it says here, right?

16:41.000 --> 16:45.000
It is basically converting the natural language into an OpenSearch DSL query

16:45.000 --> 16:46.000
by using an LLM.

16:46.000 --> 16:49.000
This is the LLM that I'm using.

16:49.000 --> 16:50.000
Yeah?

16:50.000 --> 16:52.000
Of course, I have another agent.

16:52.000 --> 16:54.000
Because this is an open-source platform,

16:54.000 --> 16:59.000
we also have an agent that is actually running on Ollama

16:59.000 --> 17:03.000
on my tiny EC2 instance, on CPU, yeah?

17:03.000 --> 17:05.000
A 7-billion-parameter model.

17:05.000 --> 17:09.000
I'm saying all these things because it's super slow.

17:09.000 --> 17:10.000
Yeah?

17:10.000 --> 17:15.000
But just to prove a point, this is going to give me the exact query.

17:15.000 --> 17:17.000
If it doesn't fail me.

17:17.000 --> 17:21.000
But the bottom line is that you are able to do this, you know?

17:21.000 --> 17:25.000
You don't need a super special Claude 4.5 model.

17:25.000 --> 17:30.000
You are able to do it with a model that is just under 5 gigabytes.

17:30.000 --> 17:31.000
Yeah?

17:31.000 --> 17:34.000
So, you get the same exact query, same exact result.

17:34.000 --> 17:35.000
Yeah?

17:35.000 --> 17:39.000
So, and of course, one last thing I want to say is you can also,

17:39.000 --> 17:42.000
if you want to be a bit more deterministic.

17:42.000 --> 17:43.000
Yeah?

17:43.000 --> 17:45.000
if you do not want the LLM to hallucinate,

17:45.000 --> 17:47.000
you can also use search templates.

17:47.000 --> 17:48.000
Yeah?

17:48.000 --> 17:50.000
You can take a bit more control.

17:50.000 --> 17:54.000
I think I'm taking more time than planned; I may be running a bit late.

17:54.000 --> 17:57.000
But I also want to show you these things that I was talking about earlier.

17:57.000 --> 17:58.000
Right?

17:58.000 --> 18:00.000
So, we did a single query comparison.

18:00.000 --> 18:05.000
There is also a possibility to do a volume test with query set comparison.

18:05.000 --> 18:06.000
Right?

18:06.000 --> 18:07.000
Which is what I did here.

18:07.000 --> 18:08.000
Right?

18:08.000 --> 18:12.000
With 150 queries, you get all the ranking similarity metrics here.

18:12.000 --> 18:15.000
There is obviously a bug in here, which is fixed in the next release,

18:15.000 --> 18:17.000
which is what Dotan is going to talk about.

18:17.000 --> 18:20.000
But basically, what you did for a single query

18:20.000 --> 18:21.000
Right?

18:21.000 --> 18:23.000
can be seen at volume here.

18:23.000 --> 18:24.000
Right?

18:24.000 --> 18:29.000
But I'd like to actually show this one, the search evaluation, right?

18:29.000 --> 18:34.000
Because for this one, if you go back: search evaluation, right?

18:34.000 --> 18:38.000
This is also at volume, as I'm going to pass a query set here.

18:38.000 --> 18:40.000
It's just a simple query, right?

18:40.000 --> 18:44.000
But what I'm also passing here is the judgment list.

18:44.000 --> 18:50.000
Because for search quality metrics, for NDCG and for precision, you need a judgment list.
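As a reminder of what such a metric computes, here is a minimal NDCG@k implementation over a graded judgment list; this is the generic textbook formulation, not the Workbench's internal code.

```python
import math

def dcg(gains):
    # Discounted cumulative gain: relevance discounted by log2 of rank.
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains))

def ndcg_at_k(ranked_doc_ids, judgments, k=10):
    """judgments: {doc_id: graded relevance}; unjudged docs count as 0."""
    gains = [judgments.get(d, 0) for d in ranked_doc_ids[:k]]
    ideal = sorted(judgments.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0
```

A ranking that puts the highest-graded documents first scores 1.0; any misordering of judged documents pulls the score below that.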

18:50.000 --> 18:54.000
So, OpenSearch allows you to add either implicit or explicit judgments.

18:54.000 --> 18:57.000
Explicit, you just have to define it somehow.

18:57.000 --> 19:01.000
Implicit, you can use a project called User Behavior Insights,

19:01.000 --> 19:03.000
within OpenSearch.

19:03.000 --> 19:07.000
It allows you to collect click metrics.

19:07.000 --> 19:11.000
Imagine Google Analytics, but search-specific.

19:11.000 --> 19:13.000
So, take a look at the project.

19:13.000 --> 19:18.000
It's quite interesting, at least for me, as a search engineer.

19:18.000 --> 19:20.000
Anyway, yeah.

19:20.000 --> 19:24.000
The nicest thing is for your data scientists

19:24.000 --> 19:30.000
to be able to take a step back,

19:30.000 --> 19:32.000
have a holistic picture.

19:32.000 --> 19:36.000
We paint a nice little dashboard for it.

19:36.000 --> 19:39.000
I'm not going to go over what is here,

19:39.000 --> 19:46.000
but just to make a point that you can install the dashboards,

19:46.000 --> 19:49.000
and you get this nice stuff.

19:49.000 --> 19:51.000
And lastly, the hybrid search optimizer.

19:51.000 --> 19:54.000
I told you about the different combinations of hyperparameters, right?

19:54.000 --> 19:56.000
So, when you actually run it,

19:56.000 --> 19:59.000
you can actually see: "tea towels" is a search query.

19:59.000 --> 20:02.000
It ran for 0.8/0.2

20:02.000 --> 20:07.000
with a harmonic mean, for 0.5/0.5

20:07.000 --> 20:09.000
with a geometric mean, and so on and so forth.

20:09.000 --> 20:10.000
You get the idea, right?

20:10.000 --> 20:12.000
So, it ran for every single query,

20:12.000 --> 20:15.000
every single combination of that, right?

20:15.000 --> 20:18.000
And then it tells me, let's go to the dashboards.

20:18.000 --> 20:21.000
Once again, it tells me,

20:21.000 --> 20:25.000
which combination is the best combination,

20:25.000 --> 20:28.000
best hyper parameter combination for me?

20:28.000 --> 20:32.000
For my data set, with the judgment list that I am giving,

20:32.000 --> 20:34.000
which is basically the arithmetic mean,

20:34.000 --> 20:37.000
with L2, 0.1-point-something; whatever, I'm not able to see it from here.

20:37.000 --> 20:42.000
So, this is the power of the Search Relevance Workbench.

20:42.000 --> 20:48.000
Try it out. And I think I will have to change the screen.

20:48.000 --> 20:52.000
So, thank you so much.

20:52.000 --> 20:54.000
Let's give Aswath a round of applause.

20:54.000 --> 20:56.000
Come on. He did a great job.

20:56.000 --> 20:59.000
And the demo gods were with us.

20:59.000 --> 21:02.000
So, we'll show you the QR code,

21:02.000 --> 21:06.000
so you can actually run the demo yourself using the Dev Tools

21:06.000 --> 21:10.000
API, or of course, through the OpenSearch Dashboards UI,

21:10.000 --> 21:12.000
if it's more comfortable for you.

21:12.000 --> 21:16.000
So, you can do this, and it doesn't connect to the screen,

21:16.000 --> 21:18.000
external screen.

21:18.000 --> 21:24.000
And next up, Aswath will show us some more search highlights in v3.

21:24.000 --> 21:25.000
Go ahead.

21:25.000 --> 21:26.000
Yeah. Thanks, Dotan.

21:26.000 --> 21:29.000
So, I think we are running out of time,

21:29.000 --> 21:31.000
but I want to tell you this one, right?

21:31.000 --> 21:34.000
So, Dotan was talking about PPL, the Piped Processing Language.

21:34.000 --> 21:39.000
So, this is more or less a kind of a new programming,

21:39.000 --> 21:41.000
not programming, but query language.

21:41.000 --> 21:45.000
So, what we have is also a natural language ability

21:45.000 --> 21:47.000
to convert the natural language into PPL query.

21:47.000 --> 21:51.000
Yeah. So, as you see here in this nice little animation, right?

21:51.000 --> 21:54.000
So, for using it,

21:54.000 --> 21:57.000
you don't have to know the query language from the get-go, right?

21:57.000 --> 21:59.000
You can learn it along the way.

21:59.000 --> 22:02.000
Oh, and there are also nice little dashboards

22:02.000 --> 22:06.000
that are automatically created when you use natural language to PPL.

22:07.000 --> 22:10.000
But for the more advanced users, yeah,

22:10.000 --> 22:12.000
you can also create complicated queries.

22:12.000 --> 22:15.000
I will leave that there. I'm not going to say what kind of complicated queries.

22:15.000 --> 22:19.000
But there are also new commands that you can use,

22:19.000 --> 22:21.000
like regex.

22:21.000 --> 22:23.000
You can also use a Grok pattern.

22:23.000 --> 22:28.000
You can also use ML, for k-means and random cut forests.

22:28.000 --> 22:31.000
There's a lot of new commands that are possible with PPL.

22:31.000 --> 22:35.000
But the most striking thing for me is the ability to join.

22:35.000 --> 22:39.000
So, join is a new command that is released in PPL.

22:39.000 --> 22:42.000
Something that was long-standing and often asked for.

22:42.000 --> 22:47.000
Now, you are able to, you know, join multiple indices in

22:47.000 --> 22:49.000
one place, yeah.

22:49.000 --> 22:51.000
Oh, sub query is also possible.
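For a taste of the language, here are two illustrative PPL queries held as Python strings; the index and field names are invented, and the exact join syntax should be verified against the current PPL reference.

```python
# A classic piped aggregation: filter, aggregate, sort, truncate.
top_errors = """
source = web_logs
| where status >= 500
| stats count() as errors by endpoint
| sort - errors
| head 10
"""

# New in 3.x: joining two indices in a single PPL query
# (hypothetical indices "orders" and "users").
orders_with_users = """
source = orders
| join ON orders.user_id = users.user_id users
| fields order_id, total
"""
```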

22:51.000 --> 22:55.000
I won't go over the rest of the stuff here,

22:55.000 --> 22:58.000
but I'm going to pass it back over to Dotan.

22:58.000 --> 22:59.000
Thank you.

22:59.000 --> 23:01.000
So, I want to talk a bit about performance,

23:01.000 --> 23:04.000
because, you know, performance

23:04.000 --> 23:08.000
has been a point of emphasis for the OpenSearch community since its origins.

23:08.000 --> 23:12.000
A lot of users actually choose OpenSearch for its ability to ingest

23:12.000 --> 23:16.000
and manage and explore massive amounts of data,

23:16.000 --> 23:18.000
but with massive amounts of data comes obviously,

23:18.000 --> 23:20.000
a premium on performance.

23:20.000 --> 23:24.000
So, we strive to build the most performant search engine out there,

23:24.000 --> 23:26.000
and that has been the case since version one.

23:26.000 --> 23:29.000
And this is why I brought this plot, from

23:29.000 --> 23:30.000
1.x to 3.x.

23:30.000 --> 23:31.000
So, as you can see.

23:31.000 --> 23:37.000
And actually, the gains have been so big that we had to use a logarithmic y-axis.

23:37.000 --> 23:39.000
So, when you see the graph going down,

23:39.000 --> 23:43.000
it actually means that it's been decreasing exponentially.

23:43.000 --> 23:48.000
And, in fact, the one that is the average across all query types,

23:48.000 --> 23:51.000
on the Big5 workload, the benchmark that you probably know,

23:51.000 --> 23:53.000
that's the dotted line in the middle,

23:53.000 --> 23:58.000
has improved 10 times from 1.3 to 3.x.

23:59.000 --> 24:01.000
So, pretty nice to see.

24:01.000 --> 24:04.000
And by the way, it's not just about the query latency.

24:04.000 --> 24:09.000
We also see significant performance gains in ingestion,

24:09.000 --> 24:13.000
in the storage, in throughput, in the transport.

24:13.000 --> 24:17.000
There are dozens of new features in version three that contribute

24:17.000 --> 24:19.000
to these performance boosts.

24:19.000 --> 24:21.000
So, we're running out of time.

24:21.000 --> 24:23.000
So, I won't go over all of these,

24:23.000 --> 24:25.000
but just to give you some highlights.

24:25.000 --> 24:26.000
First, the GPU acceleration.

24:26.000 --> 24:30.000
We all know here in the search devroom that vector operations

24:30.000 --> 24:35.000
are heavy, especially the distance calculations and k-NN.

24:35.000 --> 24:38.000
But, on the other hand, they lend themselves to parallel computation

24:38.000 --> 24:39.000
with GPUs.
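As a sketch of why GPUs help here: in brute-force k-NN, every query-to-vector distance is independent of the others, so they can all be computed in parallel. This toy, pure-Python version (not OpenSearch's implementation) just makes that independent loop visible:

```python
# Brute-force k-NN: compute the distance from one query vector to every
# indexed vector, then keep the k closest. Each distance is independent,
# which is why GPU libraries can parallelize this step so effectively.

def squared_l2(a, b):
    # Squared Euclidean distance between two equal-length vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def knn(query, vectors, k):
    # In a real engine, this per-vector scoring is the parallel part.
    order = sorted(range(len(vectors)), key=lambda i: squared_l2(query, vectors[i]))
    return order[:k]

vectors = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [0.9, 1.1]]
print(knn([1.0, 1.0], vectors, 2))  # → [1, 3]
```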

24:39.000 --> 24:42.000
So, as part of the project, we've introduced

24:42.000 --> 24:45.000
the ability to leverage NVIDIA GPUs,

24:45.000 --> 24:48.000
with the cuVS CAGRA algorithm,

24:48.000 --> 24:50.000
which shows pretty impressive results.

24:50.000 --> 24:55.000
You can see 9.3x improvement there on the indexing speed.

24:55.000 --> 24:57.000
Cost is down by nearly four times.

24:57.000 --> 24:59.000
That's on the GPU acceleration.

24:59.000 --> 25:01.000
Another cool thing is,

25:01.000 --> 25:03.000
Lucene-on-Faiss.

25:03.000 --> 25:05.000
Because we know that Lucene,

25:05.000 --> 25:07.000
as many of us know,

25:07.000 --> 25:09.000
is memory-efficient, but doesn't scale very well.

25:09.000 --> 25:11.000
And Faiss, on the other hand,

25:11.000 --> 25:13.000
scales very nicely, but then again,

25:13.000 --> 25:15.000
you need to load the entire

25:15.000 --> 25:17.000
HNSW graph into memory.

25:17.000 --> 25:19.000
So, we came up with this clever thing of

25:19.000 --> 25:20.000
Lucene-on-Faiss.

25:20.000 --> 25:22.000
It gives the best of both worlds.

25:22.000 --> 25:24.000
Try it out, it's pretty new,

25:24.000 --> 25:25.000
recently launched.

25:25.000 --> 25:28.000
The initial tests show impressive results.

25:28.000 --> 25:31.000
gRPC, with Protobuf,

25:31.000 --> 25:33.000
again, a very good way

25:33.000 --> 25:36.000
for faster, more efficient data transport

25:36.000 --> 25:40.000
and data processing compared to JSON and REST.
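A rough illustration of why binary transport helps, using Python's struct module as a stand-in for Protobuf wire encoding. This is not OpenSearch's actual schema, just a size comparison on a made-up embedding payload:

```python
import json
import struct

# The same 128-dimensional vector, encoded as JSON text versus as packed
# float32 values (4 bytes each), as a binary protocol would carry it.
vector = [0.12345678] * 128

as_json = json.dumps({"vector": vector}).encode("utf-8")
as_binary = struct.pack(f"{len(vector)}f", *vector)

# The JSON payload is roughly 3x larger, and the receiver must also parse
# text instead of reading fixed-width numbers directly.
print(len(as_json), len(as_binary))
```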

25:40.000 --> 25:42.000
And especially with the

25:42.000 --> 25:44.000
significantly heavier operations,

25:44.000 --> 25:46.000
like bulk indexing, k-NN

25:46.000 --> 25:48.000
search, and all of these.

25:48.000 --> 25:50.000
And also, it enables

25:50.000 --> 25:54.000
use of platforms such as Apache Arrow and Arrow Flight.

25:54.000 --> 25:56.000
So, Arrow, I don't know if you caught

25:56.000 --> 25:58.000
the great talk yesterday for those

25:58.000 --> 25:59.000
attending the databases devroom there,

25:59.000 --> 26:00.000
but essentially,

26:00.000 --> 26:02.000
it eliminates the serialization and

26:02.000 --> 26:04.000
deserialization overhead.

26:04.000 --> 26:06.000
And also, it allows for streaming architecture

26:06.000 --> 26:08.000
when processing aggregations,

26:08.000 --> 26:10.000
because it allows delivering

26:10.000 --> 26:13.000
partial results to boost complex aggregations.

26:13.000 --> 26:15.000
We mentioned briefly before

26:15.000 --> 26:17.000
the reader/writer separation,

26:17.000 --> 26:19.000
so that you can separate the search and the indexing

26:19.000 --> 26:22.000
workloads to avoid congestion and get

26:22.000 --> 26:25.000
more predictable performance.

26:25.000 --> 26:26.000
And many more, again,

26:26.000 --> 26:28.000
I won't have time to go over all of these.

26:28.000 --> 26:30.000
So, I want, in the time remaining,

26:30.000 --> 26:33.000
to say a word about what's coming next.

26:33.000 --> 26:35.000
The OpenSearch project runs on an eight-week

26:35.000 --> 26:37.000
release train, so there's always

26:37.000 --> 26:40.000
good stuff coming around the corner.

26:40.000 --> 26:41.000
And in the coming month,

26:41.000 --> 26:43.000
we're going to release 3.5.

26:43.000 --> 26:45.000
And I wanted to just give you a taste of

26:45.000 --> 26:47.000
some of the innovations in search and

26:47.000 --> 26:48.000
observability.

26:48.000 --> 26:50.000
These are the ones that I particularly like.

26:50.000 --> 26:52.000
So, you know,

26:52.000 --> 26:53.000
in just 3.3,

26:53.000 --> 26:54.000
the release before last,

26:54.000 --> 26:56.000
we've added skip lists for

26:56.000 --> 26:57.000
date histograms,

26:57.000 --> 26:58.000
for those who don't know.

26:58.000 --> 27:00.000
So, with 3.5,

27:00.000 --> 27:02.000
we're going to add skip list support for

27:02.000 --> 27:03.000
aggregations.

27:03.000 --> 27:05.000
So, for analytics use cases,

27:05.000 --> 27:07.000
this offers an alternative to the

27:07.000 --> 27:09.000
multi-range traversal,

27:09.000 --> 27:11.000
that's much more efficient for

27:11.000 --> 27:13.000
certain frequently used query types.

27:13.000 --> 27:15.000
For example, if you have a query that

27:16.000 --> 27:18.000
filters on one field and then

27:18.000 --> 27:20.000
searches on another,

27:20.000 --> 27:21.000
things like that,

27:21.000 --> 27:24.000
we really see at least two times faster

27:24.000 --> 27:25.000
and so on.
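A toy sketch of the block-skipping idea behind skip lists: per-block min/max metadata lets the engine skip whole blocks that cannot match the filter, instead of scanning every document. Lucene's actual skip lists and doc-value blocks are more involved; this is an assumption-laden illustration only:

```python
# Group values into fixed-size blocks and record each block's min/max.
# A range filter can then reject whole blocks without scanning them.

BLOCK = 4

def build_blocks(values):
    blocks = []
    for i in range(0, len(values), BLOCK):
        chunk = values[i:i + BLOCK]
        blocks.append((min(chunk), max(chunk), chunk))
    return blocks

def count_in_range(blocks, lo, hi):
    scanned = matched = 0
    for bmin, bmax, chunk in blocks:
        if bmax < lo or bmin > hi:
            continue  # the whole block is outside the range: skip it
        scanned += len(chunk)
        matched += sum(lo <= v <= hi for v in chunk)
    return matched, scanned

values = [1, 2, 3, 4, 50, 51, 52, 53, 7, 8, 9, 10]
matched, scanned = count_in_range(build_blocks(values), 1, 10)
print(matched, scanned)  # 8 matched; only 8 of 12 values scanned
```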

27:25.000 --> 27:27.000
Another one is that the next release

27:27.000 --> 27:29.000
will introduce APM support,

27:29.000 --> 27:30.000
application

27:30.000 --> 27:32.000
performance monitoring, with

27:32.000 --> 27:33.000
support for

27:33.000 --> 27:34.000
Prometheus

27:34.000 --> 27:35.000
for metric data,

27:35.000 --> 27:37.000
and OpenTelemetry

27:37.000 --> 27:39.000
as the instrumentation standard.

27:39.000 --> 27:41.000
So APM support, a really cool

27:41.000 --> 27:43.000
thing coming next.

27:44.000 --> 27:45.000
And last but not least,

27:45.000 --> 27:46.000
in 3.5,

27:46.000 --> 27:47.000
we're going to introduce

27:47.000 --> 27:49.000
agent observability for

27:49.000 --> 27:50.000
root cause analysis.

27:50.000 --> 27:51.000
So, that's pretty cool,

27:51.000 --> 27:53.000
not having your agents as

27:53.000 --> 27:54.000
black boxes,

27:54.000 --> 27:55.000
we'll be able to analyze the

27:55.000 --> 27:56.000
performance,

27:56.000 --> 27:58.000
and refine our prompts,

27:58.000 --> 27:59.000
based on that.

27:59.000 --> 28:00.000
So, it's going to be released

28:00.000 --> 28:01.000
as experimental,

28:01.000 --> 28:03.000
we want folks to start getting their hands on it

28:03.000 --> 28:04.000
and give us feedback on

28:04.000 --> 28:05.000
it.

28:05.000 --> 28:06.000
It's all new to all of us,

28:06.000 --> 28:07.000
of course.

28:07.000 --> 28:08.000
So, to get a sense of whether this is

28:08.000 --> 28:10.000
the right direction as a community.

28:10.000 --> 28:11.000
And beyond 3.5,

28:11.000 --> 28:12.000
there's a lot more

28:12.000 --> 28:14.000
to talk about: decoupling from

28:14.000 --> 28:16.000
Lucene and opening up for a

28:16.000 --> 28:17.000
composable,

28:17.000 --> 28:19.000
pluggable architecture to introduce

28:19.000 --> 28:20.000
more engines

28:20.000 --> 28:21.000
and more formats,

28:21.000 --> 28:22.000
for example,

28:22.000 --> 28:23.000
Parquet

28:23.000 --> 28:24.000
support,

28:24.000 --> 28:25.000
and much more.

28:25.000 --> 28:27.000
So, do stay tuned,

28:27.000 --> 28:28.000
and where is that?

28:28.000 --> 28:29.000
Where it all is,

28:29.000 --> 28:30.000
of course,

28:30.000 --> 28:31.000
the GitHub repo,

28:31.000 --> 28:32.000
and a GitHub org with

28:32.000 --> 28:33.000
140 repos,

28:33.000 --> 28:34.000
actually,

28:34.000 --> 28:36.000
a very active Slack

28:36.000 --> 28:37.000
instance,

28:37.000 --> 28:38.000
with thousands of participants,

28:38.000 --> 28:40.000
and our forums,

28:40.000 --> 28:41.000
and there are also the TAGs,

28:41.000 --> 28:42.000
technical advisory groups.

28:42.000 --> 28:44.000
I'm honored to be a lead for

28:44.000 --> 28:45.000
the observability tag.

28:45.000 --> 28:47.000
We have monthly calls open for

28:47.000 --> 28:48.000
everyone,

28:48.000 --> 28:49.000
and a Slack channel for that.

28:49.000 --> 28:50.000
So, if you're into that,

28:50.000 --> 28:51.000
do join.

28:51.000 --> 28:52.000
We have user

28:52.000 --> 28:53.000
groups around the world,

28:53.000 --> 28:54.000
in around 19 countries,

28:54.000 --> 28:55.000
I think.

28:55.000 --> 28:56.000
And also,

28:56.000 --> 28:58.000
the OpenSearchCon events

28:58.000 --> 28:59.000
around the world.

28:59.000 --> 29:00.000
I just came back last month

29:00.000 --> 29:01.000
from OpenSearchCon Japan.

29:01.000 --> 29:03.000
That's the bonsai tree here.

29:03.000 --> 29:04.000
And we're going to have

29:04.000 --> 29:05.000
OpenSearchCon Europe,

29:05.000 --> 29:06.000
assuming most of you are

29:06.000 --> 29:07.000
based in Europe,

29:07.000 --> 29:08.000
in April,

29:08.000 --> 29:09.000
in Prague.

29:09.000 --> 29:10.000
So, you're all invited to join,

29:10.000 --> 29:11.000
by the way.

29:11.000 --> 29:13.000
It has a special track for Apache

29:13.000 --> 29:14.000
Lucene and search.

29:14.000 --> 29:15.000
So, even if you're not

29:15.000 --> 29:16.000
specifically with OpenSearch,

29:16.000 --> 29:17.000
but from the broader search

29:17.000 --> 29:18.000
community,

29:18.000 --> 29:19.000
we welcome your participation,

29:19.000 --> 29:21.000
There's also a CFP for the following

29:21.000 --> 29:22.000
ones,

29:22.000 --> 29:23.000
check it out.

29:23.000 --> 29:25.000
And check out the QR code,

29:25.000 --> 29:26.000
all the links are there

29:26.000 --> 29:27.000
on one page,

29:27.000 --> 29:28.000
including some other stuff,

29:28.000 --> 29:29.000
like roadmap,

29:29.000 --> 29:31.000
like the playgrounds

29:31.000 --> 29:32.000
that you can play online,

29:32.000 --> 29:33.000
and so on.

29:33.000 --> 29:34.000
And of course,

29:34.000 --> 29:35.000
Ashwath and me

29:35.000 --> 29:36.000
are here,

29:36.000 --> 29:37.000
happy to answer any questions

29:37.000 --> 29:38.000
after the talk.

29:38.000 --> 29:39.000
So, after the conference,

29:39.000 --> 29:40.000
so we put the QR code,

29:40.000 --> 29:41.000
so you can contact us

29:41.000 --> 29:42.000
also online,

29:42.000 --> 29:43.000
if we don't get to chat here.

29:43.000 --> 29:44.000
So, feel free to contact us;

29:44.000 --> 29:45.000
we'd love to hear your

29:45.000 --> 29:46.000
questions,

29:46.000 --> 29:47.000
your feedback about this talk,

29:47.000 --> 29:48.000
and anything in between.

29:48.000 --> 29:49.000
I'm Dotan,

29:49.000 --> 29:50.000
this is Ashwath.

29:50.000 --> 29:51.000
Thank you very much for listening.

29:51.000 --> 29:52.000
Thank you.

29:52.000 --> 29:55.000
Do you have 30 seconds?

