WEBVTT

00:00.000 --> 00:15.200
So, it's just time and I will start now. I'm Titouan, and I will present to you the

00:15.200 --> 00:18.520
USB MIDI 2.0 device class in Zephyr.

00:18.520 --> 00:22.880
So, first of all, how it all started and why I'm talking to you today.

00:22.880 --> 00:29.680
So, a friend of mine is a musician, and he had a project where he has a custom controller.

00:30.400 --> 00:37.920
The goal is to have a performance where he moves on top of that so we have some old distance sensors

00:37.920 --> 00:42.240
like IRB's.

00:42.240 --> 00:47.120
The problem is it's all analog and we have lots of cables, so it's very noisy and it's

00:47.120 --> 00:51.600
difficult to have some precise movement and also there isn't any feedback for him to see

00:51.600 --> 00:54.480
anything because it's an input only thing.

00:54.480 --> 01:00.320
So we built a very first prototype, so we see the box.

01:00.320 --> 01:06.920
We now have RGB touchpads, an encoder with an RGB light as well, and some new fancy

01:06.920 --> 01:14.040
time-of-flight sensors, and soon enough we had a custom prototype. You can see I'm not a PCB

01:14.040 --> 01:21.320
designer, probably, but yeah, this is working, and so we started to do things, make some

01:21.400 --> 01:27.600
sounds. We also broke some parts, by the way; that happens.

01:27.600 --> 01:35.040
But then finally, this is the final version with the nice casing, and the distance sensors are

01:35.040 --> 01:38.720
on the top.

01:38.720 --> 01:44.320
So why Zephyr here? There were lots of talks about the devicetree today, and I definitely like

01:44.320 --> 01:49.000
it, but also the Kconfig system, the west tooling, etc.

01:49.000 --> 01:53.440
It's free and open source, so I can read the code and simply update it if I want. There's

01:53.440 --> 01:57.800
a large collection of drivers, so we can just try different components and swap

01:57.800 --> 01:59.800
them out.

01:59.800 --> 02:05.520
Also, Zephyr comes with some subsystems and OS services: you have the USB subsystem

02:05.520 --> 02:11.880
built in, and we also have logging, the shell, etc.

02:11.880 --> 02:17.680
And if you have ever worked with Linux: it's definitely a different code base, but you will

02:17.760 --> 02:22.360
feel at home in Zephyr.

02:22.360 --> 02:28.880
But there wasn't any MIDI support in Zephyr, so let's build it.

02:28.880 --> 02:35.480
Let's first of all start with the MIDI 1 protocol from the 80s. So it's a very

02:35.480 --> 02:43.000
old thing and, under the hood, it's simply a serial link on which we can convey up to 16

02:43.000 --> 02:45.680
channels of instruments, I would say.

02:45.720 --> 02:52.040
You may recognize the iconic 5-pin DIN connector, which was at the time pretty much all the

02:52.040 --> 02:59.200
rage, but now it's basically only the MIDI icon in every piece of software you see.

02:59.200 --> 03:03.560
And basically it's a protocol that allows you to connect some controllers, like keyboards,

03:03.560 --> 03:09.720
knobs and sliders, to some synthesizers, so things that will produce sounds or maybe

03:09.720 --> 03:14.640
do other things.

03:14.680 --> 03:19.280
So what's going over this serial link? There are some messages, but the most important

03:19.280 --> 03:22.680
thing to remember is that it's a stream of bytes.

03:22.680 --> 03:28.160
The first one will always have its highest bit set to one, which means it's the start of

03:28.160 --> 03:32.760
a packet, and all the other ones will be on seven bits, with the highest one being zero.
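
NOTE
A minimal Python sketch of the framing rule described above, not code from the talk: a byte with its highest bit set starts a message, and bytes with that bit cleared are 7-bit data. The helper names are hypothetical, and running status and real-time bytes are deliberately ignored.
```python
def is_status(byte):
    """A status byte opens a message: its highest bit (0x80) is set."""
    return bool(byte & 0x80)
def split_messages(stream):
    """Cut a MIDI 1 byte stream into messages at every status byte."""
    messages = []
    for byte in stream:
        if is_status(byte):
            messages.append(bytearray([byte]))  # a new message starts here
        elif messages:
            messages[-1].append(byte)  # 7-bit data byte of the current message
        # data bytes seen before any status byte are dropped
    return [bytes(m) for m in messages]
```
For example, split_messages(bytes([0x90, 0x46, 0x7F, 0x80, 0x46, 0x00])) returns one three-byte note on followed by one three-byte note off.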

03:32.760 --> 03:37.480
So we have different types of messages, but the most common ones are note on and note

03:37.480 --> 03:43.520
off, which start with 9 or 8 (in hexadecimal) and then the channel number, which can go from 0 to

03:43.560 --> 03:48.800
15, then the note number and then the velocity, the velocity being how hard

03:48.800 --> 03:53.720
you press the key, for instance. You can also send some control changes, which is what

03:53.720 --> 04:00.320
we use in our controller, so knobs and sliders, simply a value. This one also has the

04:00.320 --> 04:05.920
control number, so which knob you're actually rotating, and again the value.
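
NOTE
A hedged sketch of how the three-byte messages just described can be built: the status byte carries the message kind in its high nibble and the zero-based channel in its low nibble, followed by two 7-bit data bytes. The function and constant names are illustrative, not taken from the talk or from Zephyr.
```python
NOTE_OFF, NOTE_ON, CONTROL_CHANGE = 0x8, 0x9, 0xB  # high nibble of the status byte
def channel_message(kind, channel, data1, data2):
    """Build a 3-byte MIDI 1 message (note on/off or control change)."""
    if not 0 <= channel <= 15:
        raise ValueError("channel must be 0..15 (musicians say 1..16)")
    if not (0 <= data1 <= 127 and 0 <= data2 <= 127):
        raise ValueError("data bytes are 7-bit values")
    return bytes([(kind << 4) | channel, data1, data2])
def note_on(channel, note, velocity):
    return channel_message(NOTE_ON, channel, note, velocity)
def note_off(channel, note, velocity=0):
    return channel_message(NOTE_OFF, channel, note, velocity)
def control_change(channel, controller, value):
    return channel_message(CONTROL_CHANGE, channel, controller, value)
```
Here note_on(0, 0x46, 100) gives the bytes 90 46 64 in hexadecimal, and control_change(2, 7, 64) gives B2 07 40.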

04:05.920 --> 04:11.200
Also you can tell a synthesizer to change its program, so to move from a sound, which

04:11.200 --> 04:15.960
is a piano to the sound of a duck, I don't know. There's also the pitch bend, which is like

04:15.960 --> 04:21.600
when you have a guitar and you push the strings, so you're pitching the sound up or down, etc.
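
NOTE
The talk does not show the byte layout of these two messages, so this sketch relies on the MIDI 1.0 encoding: program change is status C plus one data byte, and pitch bend is status E plus a 14-bit value sent as two 7-bit bytes, least significant first, with 8192 meaning no bend. Function names are hypothetical.
```python
def program_change(channel, program):
    """Two-byte message asking the synthesizer to load another sound."""
    if not (0 <= channel <= 15 and 0 <= program <= 127):
        raise ValueError("channel is 0..15, program is 0..127")
    return bytes([0xC0 | channel, program])
def pitch_bend(channel, value):
    """value is 14-bit: 0 is full down, 8192 is centered, 16383 is full up."""
    if not (0 <= channel <= 15 and 0 <= value <= 0x3FFF):
        raise ValueError("channel is 0..15, value is 0..16383")
    return bytes([0xE0 | channel, value & 0x7F, value >> 7])  # LSB first, then MSB
```
A centered pitch bend on channel 0, pitch_bend(0, 8192), encodes as E0 00 40 in hexadecimal.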

04:21.600 --> 04:27.760
There are also the system exclusive messages, which allow you to send an arbitrary payload. This one

04:27.760 --> 04:35.400
is a bit special, because you have a start byte, which is F0, then F7 to complete it.

04:35.720 --> 04:43.000
So just a few examples: to play a note, you send the message 90 46, which means play A sharp

04:43.000 --> 04:48.840
4 on channel 1. By the way, as programmers, we start counting at zero, but musicians

04:48.840 --> 04:54.360
start at 1; it will be the same on all the slides. We can also say, hey, synthesizer, please

04:54.360 --> 05:04.040
load that program, or I can even send an ASCII string as a SysEx message. So just to be clear,

05:04.040 --> 05:10.360
if we look at a MIDI sequencer or something, the note on event is sent here, to say

05:10.360 --> 05:15.000
I start playing the note, and then we have the note off event, when you stop playing the note.

05:15.000 --> 05:19.000
So it's like when you press a key and release the key, and also the velocity, which

05:19.000 --> 05:26.840
indicates how hard you press the keys. So we are now in the 90s, the late 90s if

05:26.840 --> 05:32.920
you want, and there is this new thing called USB. Everyone has it, and it's fun, and so we

05:32.920 --> 05:42.520
are going to connect MIDI to USB. The specification is curious, I would say, but there's

05:42.520 --> 05:50.920
the USB MIDI function, which has some inputs and outputs. So we have the embedded connectors,

05:50.920 --> 05:57.080
which are actually the ones that go to the host, and then we have the external connectors that

05:57.080 --> 06:03.480
go to the outside world. And so an embedded MIDI in is actually something coming from the host, which

06:03.480 --> 06:11.640
ultimately will be an output. Yeah, so for example, if you look at what happens in this

06:11.640 --> 06:17.400
MIDI interface, this USB MIDI interface, this means that if we have the input from the keyboard,

06:17.480 --> 06:23.320
it corresponds to an external MIDI in that goes to an embedded MIDI out, from there to a MIDI in endpoint,

06:23.320 --> 06:33.480
then to the host. And the other way around, if you have an external MIDI output

06:33.480 --> 06:41.080
to some synthesizer, this is what it looks like. So with the specification, we can represent a lot

06:41.080 --> 06:47.880
of different topologies. There's the typical MIDI in/out, but you also have the opportunity

06:47.880 --> 06:52.920
to describe something that simply conveys MIDI from the external world to the external world,

06:52.920 --> 06:59.720
without going to your PC; you can express that in USB MIDI 1. Or also this complex example

06:59.720 --> 07:09.000
from the specification. But in practice, I've always seen this one used; I've never seen

07:09.160 --> 07:17.400
any other topology being used in the wild. So in 2020 came the USB MIDI 2 specification,

07:17.400 --> 07:24.440
which accompanies the MIDI 2 specification. So it's very simplified, as you can see.

07:24.440 --> 07:30.760
We don't have all those external and embedded connectors going from in to out and a lot of

07:30.760 --> 07:38.440
tangling around. So now we just have some group terminals, and group terminals link

07:38.440 --> 07:48.520
the USB to the MIDI function, I would say. And in this context, for instance, we now have a pair of

07:48.520 --> 07:56.360
outputs, and those two belong to the same block. So they form 32 channels that go together,

07:56.360 --> 08:03.800
two times 16. So you can have wider outputs for some complex interchange or stuff.

08:04.680 --> 08:13.240
And the same way for inputs, for instance. So this describes that we have one thing, which is one

08:13.240 --> 08:19.000
single thing, an instrument or something, that goes through all these ports. And this is something

08:19.000 --> 08:27.560
you can't express in USB MIDI 1. And so what do we send through all these pipes? It's the

08:27.560 --> 08:34.680
Universal MIDI Packet. So now it's aligned on four bytes, and it's one to four of these words

08:35.400 --> 08:43.000
together. So we have those function blocks, which are the group terminals we

08:43.000 --> 08:54.280
allocated on the previous slide. And this packet can carry either MIDI 1 or MIDI 2 messages.
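
NOTE
A sketch of how a receiver can cut a stream of 32-bit words into Universal MIDI Packets of one to four words. It assumes the size table of the UMP specification, where the top four bits of the first word are the message type and fix the packet length; the table and the helper names below are not shown in the talk, and the stream is assumed not to be truncated.
```python
# Number of 32-bit words per packet, indexed by message type 0x0..0xF
UMP_WORDS = (1, 1, 1, 2, 2, 4, 1, 1, 2, 2, 2, 3, 3, 4, 4, 4)
def ump_message_type(first_word):
    return (first_word >> 28) & 0xF
def ump_word_count(first_word):
    return UMP_WORDS[ump_message_type(first_word)]
def split_umps(words):
    """Group a flat list of 32-bit words into packets (tuples of words)."""
    packets = []
    index = 0
    while index < len(words):
        count = ump_word_count(words[index])
        packets.append(tuple(words[index:index + count]))
        index += count  # the next packet starts right after this one
    return packets
```
Only the first word of each packet is inspected, so a data word that happens to start with a high nibble of F is not mistaken for a new four-word packet.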

08:54.280 --> 09:02.520
In MIDI 2, we have an extended numerical range. We can have some note attributes, so we can

09:02.520 --> 09:08.680
specify some detuning or some tunings that can't be represented in MIDI 1. For instance, in non-Western

09:08.680 --> 09:17.240
music, we have notes that can't be represented in MIDI 1. And also things like chords, tempo,

09:17.240 --> 09:25.800
time signatures; we can even do karaoke over MIDI 2. But the main point here is that the

09:25.800 --> 09:31.320
MIDI 2 specification acknowledges that modern devices communicate over high-speed links that

09:31.400 --> 09:39.160
are bidirectional, like USB. Just one point: MIDI 2 is not USB MIDI 2, which itself

09:39.160 --> 09:44.840
is not USB 2. So you can, for instance, send MIDI 1 messages with USB MIDI 2 on USB 1.
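
NOTE
To illustrate sending a MIDI 1 message over USB MIDI 2, here is a hedged sketch that wraps a MIDI 1 channel voice message into a single 32-bit Universal MIDI Packet: message type 2 in the top nibble, then the group, then the original status and data bytes. The function name is hypothetical and this is plain Python, not the Zephyr API.
```python
def midi1_to_ump(group, message):
    """Pack a 2- or 3-byte MIDI 1 channel voice message into one UMP word."""
    if not 0 <= group <= 15:
        raise ValueError("group must be 0..15")
    if len(message) not in (2, 3):
        raise ValueError("expected a status byte and one or two data bytes")
    status = message[0]
    if not 0x80 <= status <= 0xEF:
        raise ValueError("not a channel voice status byte")
    data = list(message[1:]) + [0]  # pad two-byte messages with a zero
    return (0x2 << 28) | (group << 24) | (status << 16) | (data[0] << 8) | data[1]
```
The note on 90 46 64 sent on group 0 becomes the word 0x20904664, so the original three bytes stay readable in the lower 24 bits.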

09:45.880 --> 09:52.120
Just a quick one here. So just to be very clear, MIDI 2 is the presentation layer. USB MIDI 2 is

09:52.200 --> 10:01.640
the transport layer. And then USB 2.0 is all the low-level stuff. So some examples of those

10:01.640 --> 10:07.800
Universal MIDI Packets, which are now aligned on four bytes: they all have a field which is called

10:07.800 --> 10:15.000
the message type, which identifies what's inside this thing, and which specifies the length

10:15.320 --> 10:23.080
of the message. So we have the MIDI 1 message, which starts with the message type, 2,

10:23.080 --> 10:29.880
then the group, which is the output we're going to send it on. And then it's actually

10:29.880 --> 10:35.400
exactly a MIDI 1 message like we saw a few slides before. Then we have MIDI 2, the same message,

10:35.400 --> 10:41.560
note on, at the top. But you can see that we now have an extended value for the velocity, which is

10:41.560 --> 10:48.360
now on 16 bits. And we have some attributes, which are related to all those tuning things I talked

10:48.360 --> 10:54.760
about. I'm not going into the details, but the main idea is that you can put more data, I would say,

10:54.760 --> 11:00.360
in the notes. And then we have some utility messages, like with the message type F,

11:01.560 --> 11:08.920
UMP stream, which allows the host to discover some MIDI devices and possibly to configure

11:08.920 --> 11:17.160
them dynamically, to allocate some synthesizers or other things. Finally, one last word about

11:17.160 --> 11:23.080
USB. We're in the embedded room, so I assume that you know a bit about that, but just so you know:

11:23.080 --> 11:29.320
on the USB bus, if you are a device, you must expose tons of descriptors, which just say to the

11:29.320 --> 11:36.120
host, hey, if you find me, I'm capable of this, this and this. And also, I have these endpoints,

11:36.120 --> 11:45.800
so these pipes to communicate from the device to the host or the other way around. So now, overall, we have seen

11:45.800 --> 11:52.200
all that. We must expose all these tons of descriptors to the host, so it can find our device.

11:52.840 --> 11:59.320
We must respond to class specific requests, so the host will ask us, hey, what's your topology,

11:59.320 --> 12:04.520
and I will answer with the map of my inputs and outputs, which is the case in MIDI 2.

12:05.240 --> 12:12.040
USB MIDI 2, excuse me. And then we can exchange some Universal MIDI Packets on these

12:12.040 --> 12:21.960
endpoints. So in Zephyr, we have USB device_next, which is the new API for implementing USB

12:21.960 --> 12:31.720
devices. So what do we have as implementers? The hardware vendor, or some

12:31.720 --> 12:39.080
nice people, will provide us with the USB device controller, which is the part responsible for actually getting the

12:39.080 --> 12:45.880
wires at the correct voltage and sending some data. And on top of that, you are going to write

12:45.880 --> 12:55.160
some classes. Usually, when you write a USB device class, you will wrap it into a Zephyr device,

12:56.200 --> 13:04.600
which allows the end user application to simply use DEVICE_DT_GET and get a handle to your thing.

13:05.400 --> 13:12.040
And all this is managed by the USBD context, which manages all the wiring, making sure that stuff

13:12.680 --> 13:19.880
is enabled at the right time, that the descriptors are put together. I have to say it's much

13:19.880 --> 13:26.760
easier than with the old USB stack, where you had to do all of this manually. So the operational

13:26.760 --> 13:35.640
model: we are going to exchange up to 16 UMP groups between our Zephyr application and the host.

13:36.440 --> 13:45.560
So to do this, we can receive them by registering a callback. And, it's a bit cut off,

13:45.560 --> 13:56.440
but we can just use the USBD MIDI API to send all of that. And all of this happens through the USB

13:56.440 --> 14:03.880
MIDI device that we get from the devicetree. So if you look at the descriptors, unfortunately,

14:03.880 --> 14:09.400
the specification requires us to have first a fully compliant MIDI streaming interface from

14:09.400 --> 14:17.880
USB MIDI 1. So in that version, we had to put all the jacks, in and out, embedded and external,

14:19.000 --> 14:26.360
inside the USB descriptors, the main USB descriptors, I would say. But in this implementation,

14:26.360 --> 14:33.480
I just chose to put an empty class. So there's a USB MIDI 1 interface, but it's non-functional.

14:34.200 --> 14:41.160
Well, it's functional, but it doesn't have any inputs or outputs. And then we have the MIDI streaming

14:41.160 --> 14:49.160
interface for USB MIDI 2, which contains only the basic definition. We just send the topology

14:49.880 --> 14:56.440
through some class-specific requests. But this implies that your host PC has to select the

14:56.440 --> 15:02.920
alt setting number one to enable MIDI 2. And this is true for any USB MIDI interface.

15:04.200 --> 15:13.240
So let's make some noise. First of all, we just enable the USB device stack next in our

15:13.240 --> 15:20.520
Kconfig, and the USBD MIDI 2 class. That's all the project configuration. And then in the devicetree,

15:20.600 --> 15:30.120
we are going to insert a node, which is a Zephyr MIDI 2 device. And then we can declare

15:30.120 --> 15:37.640
our input and output blocks. Like this, we just put some child nodes inside. So for instance,

15:37.640 --> 15:46.440
here, block one has two bidirectional connectors. And so we have

15:46.840 --> 15:53.240
it starting at group 0 for us programmers, which means 1 for musicians, and it spans two

15:53.240 --> 16:01.240
groups, which means 32 channels, 2 times 16. And then we also have block 3 here. I skipped

16:01.240 --> 16:09.400
block 2 to avoid confusion, but you get the idea. It starts at 2 for musicians, and this one is

16:09.400 --> 16:17.480
only capable of sending MIDI 1 data, and it's input-only from the perspective of the host.
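
NOTE
A small model of the block arithmetic just described: a block starts at a zero-based group, spans some number of groups, and each group carries 16 channels. This is a hypothetical Python illustration only; the class and method names are made up and it is neither the devicetree binding nor the Zephyr API.
```python
from dataclasses import dataclass
@dataclass(frozen=True)
class FunctionBlock:
    first_group: int  # zero-based, the way programmers count
    group_count: int  # how many consecutive groups the block spans
    def groups(self):
        return list(range(self.first_group, self.first_group + self.group_count))
    def musician_groups(self):
        return [group + 1 for group in self.groups()]  # musicians count from 1
    def channel_count(self):
        return 16 * self.group_count
```
With these assumptions, FunctionBlock(0, 2) covers groups 0 and 1, which musicians call groups 1 and 2, for a total of 32 channels.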

16:20.360 --> 16:28.440
Then we can register some event handlers on the USB device, the USB MIDI device. So we have an

16:28.440 --> 16:35.800
event for when our dear host selects the right interface and USB is correctly enabled. And so, for

16:35.880 --> 16:42.520
instance, we can simply light up the LED when the host has selected the USB MIDI 2 interface.

16:44.680 --> 16:51.160
And we simply just say to our MIDI device, yeah, I want you to have these operations, like it's

16:51.160 --> 16:57.880
usually the case in that section of the code. And we are going to see this one just a bit after,

16:57.880 --> 17:05.640
but let's not go too fast. So if you want to implement a simple MIDI keyboard, so an input

17:06.440 --> 17:13.480
coming from the external world and going to the host, it's actually fairly easy. So we have our USB

17:13.480 --> 17:19.720
MIDI device, which we get from the devicetree. We are going to use the input subsystem from Zephyr.

17:19.720 --> 17:25.000
So actually it's very easy, you just enable it. And when you press a button, you can simply have

17:25.000 --> 17:34.440
a callback that's called. Then we simply build the UMP packet. So if we do a key press, it's a

17:34.440 --> 17:41.320
note on. If we do a key release, it's a note off. And then we simply build the thing. So I've just added some

17:42.440 --> 17:47.800
small models, so you can understand what happens. Actually, not every detail is very important

17:47.800 --> 17:56.040
here, but you get the idea. And the other way around. So if we want to get some events coming from the host,

17:56.200 --> 18:05.880
now we can choose to only respond to MIDI 2 packets this time. We get the channel.
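
NOTE
A hedged sketch of decoding an incoming MIDI 2 note on or note off from its two 32-bit words, following the UMP layout: message type 4, group, opcode, channel, note number and attribute type in the first word, then a 16-bit velocity and attribute data in the second. The function name and the combined channel_index (group times 16 plus channel) are illustrative choices, not the Zephyr callback signature.
```python
def decode_midi2_note(word0, word1):
    """Return a dict for a MIDI 2 note on/off packet, or None for anything else."""
    if ((word0 >> 28) & 0xF) != 0x4:
        return None  # not a MIDI 2 channel voice message
    opcode = (word0 >> 20) & 0xF
    if opcode not in (0x8, 0x9):
        return None  # neither note off (8) nor note on (9)
    group = (word0 >> 24) & 0xF
    channel = (word0 >> 16) & 0xF
    return {
        "on": opcode == 0x9,
        "channel_index": group * 16 + channel,  # 0..31 when two groups are used
        "note": (word0 >> 8) & 0x7F,
        "velocity": (word1 >> 16) & 0xFFFF,  # 16-bit velocity
    }
```
For the words 0x41924600 and 0xC8000000 this gives a note on for note 0x46 with velocity 0xC800 on channel_index 18, that is group 1, channel 2.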

18:05.880 --> 18:15.640
As a reminder, there are two groups, so we have up to 32 channels. And then we just look at

18:15.640 --> 18:22.440
which type of MIDI 2 event it is, a note on or a note off, in which case we can just start

18:22.520 --> 18:28.920
playing some tone or stop playing it, with the note that has been given in the packet, as well as

18:28.920 --> 18:36.360
the velocity, which, as a reminder, is how hard you press a key. Again, I have some models, so you can

18:36.360 --> 18:44.040
follow it later on. So, it's time for a demo. I have this board on my desk here. So I have a

18:44.040 --> 18:51.240
picture here, so you can understand what happens. So there's the ST-Link on the left, which

18:51.320 --> 18:57.240
is just used for power today. We have the user button, which will serve me as the input

18:57.240 --> 19:03.880
keyboard. We have the LED, which is green, if we have USB MIDI 2 enabled, then we have the famous

19:03.880 --> 19:15.720
USB MIDI going to my computer and maybe some audio outputs. Now, you have to tell me if that's

19:15.800 --> 19:28.680
readable. I will actually try to put that. So, I've scripted it, so it's a bit

19:28.680 --> 19:37.080
easier for me to present. So let's first start by finding our device. On Linux, we can just

19:38.200 --> 19:44.680
look at the ALSA sequencer clients. And we see that we have all the blocks I presented

19:44.680 --> 19:54.440
on the other slide. I can read some events in the basic raw UMP format, directly from /dev/snd/

19:54.440 --> 20:02.280
ump, blah blah blah. And so I can also record them here on my MIDI sequencer. So if I play,

20:02.280 --> 20:07.480
if I just press the button, you can see the note on, the note off. That's impressive, no?

20:07.480 --> 20:20.920
Also, we can see it in the old ALSA format. So we have the two groups. I will stop this.

20:23.160 --> 20:29.720
Oh, yeah, I had a timer, so I didn't demo that you can also read them in the old format.

20:29.800 --> 20:41.080
Oh, yeah, I should plug in my speaker now. So yeah, probably if I just add that, I can play from

20:41.080 --> 20:50.600
my sequencer and it doesn't work. Maybe I haven't selected the right output. I haven't selected

20:50.680 --> 20:58.360
any input for two. So yeah, it's very, very impressive, no?

21:03.080 --> 21:08.600
So, okay, I can play some note like this, but I can also, yeah, play some notes from all

21:08.680 --> 21:13.240
sides. I used that picture, but I can also replay you.

21:29.240 --> 21:34.440
And so that was my presentation. The main point I would like to mention:

21:35.000 --> 21:41.800
it was merged just yesterday evening. So now you have USB MIDI 2 coming in

21:41.800 --> 21:49.160
Zephyr 4.1. I hope that you can do some fancy things with it. You can find me on the Zephyr Discord

21:49.160 --> 21:57.080
as I do and all those platforms as well. And if you have any questions, I'd be glad to take them.

21:57.160 --> 22:00.200
Question.

22:05.560 --> 22:13.320
Thank you. That was quite interesting. Do you know how far the penetration of MIDI 2.0

22:13.320 --> 22:16.680
and USB MIDI 2.0 in the audio device segment is?

22:17.400 --> 22:27.000
So I don't really know. I can't even tell which vendors ship them. It's very

22:27.000 --> 22:33.640
new; it's from 2020. So first the spec has to be completely released, then vendors have to want to implement it.

22:33.640 --> 22:37.880
And then there's the development time and the time to market. So we are only seeing the first

22:37.880 --> 22:44.520
new MIDI devices coming up. I've never seen one, so everything I developed here is like

22:44.600 --> 22:49.160
I've just read the specification and tried to figure out if that matches what the specification says.

22:51.160 --> 22:58.680
So I'm not a product person. Any more questions?

22:59.640 --> 23:18.760
If I want to adapt this to my existing keyboard and hardware, what would I need to change

23:18.760 --> 23:22.360
because it's older than 2022?

23:24.360 --> 23:29.240
So if you have a keyboard which has a MIDI output, you would have to be prepared to

23:29.240 --> 23:35.560
unsolder the microcontroller inside, solder in a new one, and be able to

23:35.560 --> 23:41.560
reprogram it using Zephyr or whatever stack you want. This is what it would take.

23:41.560 --> 23:47.960
I mean it's software, so somehow you need to update the software of the thing running on your

23:47.960 --> 23:54.600
keyboard. I mean, is Zephyr also able to talk the old MIDI protocol?

23:54.600 --> 24:03.320
Yeah, so there's the old MIDI 1, the physical protocol, which is a serial link, and this

24:03.320 --> 24:09.320
is not what my presentation is about. It's a very simple UART, so you just need to implement a very

24:09.320 --> 24:15.800
simple parser. And then if you want to plug it into USB, you just wrap it into UMPs

24:15.960 --> 24:21.720
and use the Zephyr USB MIDI API to send them to the host, if that's something you're up to.

24:21.720 --> 24:26.360
But if you have an existing device and expect it to speak MIDI 2, then you will need to

24:26.360 --> 24:32.280
reprogram it somehow. Yeah, sure. No, I was fine with the first use case. Yeah. Thank you.

24:37.240 --> 24:39.240
You can ask the question; I will repeat it.

24:40.200 --> 24:46.680
Is it possible to use Zephyr to translate from MIDI 1 to MIDI 2?

24:47.080 --> 24:51.480
So the question was, is it possible to use Zephyr to translate from MIDI 1 to MIDI 2?

24:52.040 --> 24:59.640
So translating from MIDI 1 to MIDI 2 is actually quite easy. You take the 7 bits, for instance, of

24:59.640 --> 25:05.080
the velocity. You shift them by 9 to the left and this gives you the velocity on 16 bits.
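
NOTE
The shift described in the answer, as a minimal Python sketch with a hypothetical function name. Note that a plain shift maps the maximum 7-bit velocity 127 to 0xFE00 rather than 0xFFFF; the sketch implements exactly what was said, nothing more refined.
```python
def velocity_7_to_16(velocity7):
    """Widen a MIDI 1 velocity (7 bits) to a MIDI 2 velocity (16 bits)."""
    if not 0 <= velocity7 <= 127:
        raise ValueError("MIDI 1 velocity must be 0..127")
    return velocity7 << 9  # 16 - 7 = 9 extra low bits, all zero
```
For instance, velocity_7_to_16(100) is 51200, which is 0xC800.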

25:05.720 --> 25:15.480
So sure, you can do that. The other way around is a bit, it's possible, but you lose either

25:15.480 --> 25:23.000
the resolution or also the note attributes, the tunings and stuff. So going from MIDI 1 to MIDI 2 is

25:23.000 --> 25:27.320
trivial, but the other way around is not possible without loss. You lose some information.
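
NOTE
A sketch of why the reverse direction is lossy, assuming the simplest possible downscaling (dropping the 9 low bits, the inverse of the shift mentioned above). The function names are hypothetical. Every 7-bit velocity then stands for 512 different 16-bit velocities, and note attributes have no MIDI 1 equivalent at all.
```python
def velocity_16_to_7(velocity16):
    """Narrow a MIDI 2 velocity (16 bits) to a MIDI 1 velocity (7 bits)."""
    if not 0 <= velocity16 <= 0xFFFF:
        raise ValueError("MIDI 2 velocity must be 0..65535")
    return velocity16 >> 9  # the 9 low bits are discarded
def indistinguishable_velocities(velocity7):
    """All 16-bit velocities that collapse onto the same 7-bit value."""
    return range(velocity7 << 9, (velocity7 + 1) << 9)
```
Both 0xC800 and 0xC9FF come back as 100, so the original 16-bit value cannot be recovered after a round trip.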

25:28.600 --> 25:31.720
That's all from me. Thank you very much. Have a nice talk to everyone.

25:35.080 --> 25:37.320
Thank you.

