WEBVTT

00:00.000 --> 00:12.000
Our next talk, or our next speaker, rather, is Ivan Ponomarev,

00:12.000 --> 00:21.000
presenting, made a documentation from DSL to dynamic docs with

00:21.000 --> 00:29.000
AsciiDoc and Antora. Please welcome Ivan. Thank you. Thank you.

00:29.000 --> 00:37.000
Yeah, we're entering the fourth hour in this room. Things are getting difficult. Thank you for coming.

00:37.000 --> 00:43.400
So, yeah, I'm Ivan, and currently I'm working as a team leader at a tech startup called

00:43.400 --> 00:52.000
Synthesized in London. We're producing test data automation you can trust. I'm also teaching computer science and Java

00:52.000 --> 01:00.000
at a number of universities. So, actually, I identify myself as a software engineer, but I probably

01:00.000 --> 01:07.000
produce more technical documentation and slides than actual code in my

01:07.000 --> 01:15.000
work. So, I would like to share my experience with the product that I'm currently working

01:15.000 --> 01:24.000
on at Synthesized, in particular how we document it. So, what does this product do? Well,

01:24.000 --> 01:30.000
roughly speaking, it transforms and generates data in relational databases. So, at its core,

01:30.000 --> 01:36.000
it has a number of things that we call transformers or generators. There are several

01:36.000 --> 01:43.000
kinds of them. To give you some examples, it can be just a random number generator with some

01:43.000 --> 01:48.000
statistical distribution. Or it can be a person generator, which will

01:48.000 --> 01:57.000
mock up somebody's gender, age, name, and so on and so forth. Or we can give it several categories,

01:57.000 --> 02:04.000
give it probabilities, and it will generate categories. And more:

02:04.000 --> 02:12.000
there are 35 of them, and the whole product is built on top. So, what do we document? What is the surface,

02:12.000 --> 02:19.000
the user-facing surface of the product? First of all, things can be configured via

02:19.000 --> 02:27.000
a YAML file, or they can be configured via the user interface, the editor. And of course, as usual,

02:28.000 --> 02:34.000
this product evolves over time. It's several years old now. So, new transformers are added.

02:34.000 --> 02:41.000
Features are changed, bugs are fixed. So, we need to keep a number of things in sync,

02:41.000 --> 02:46.000
actually, in this product. So, the things that must stay in sync. First

02:46.000 --> 02:52.000
is the JSON Schema for the YAML configuration. Of course, we want to validate this YAML before we even

02:52.000 --> 02:58.000
start feeding it to our product. And we want context-sensitive hints in editors.

02:58.000 --> 03:04.000
So, we need this JSON Schema, always up to date. Also, the projectional editor UI,

03:04.000 --> 03:10.000
which actually follows the same structure. I believe that, as people who are in this room,

03:10.000 --> 03:16.000
you are big fans of everything-as-code, but not every user is a big fan of coding.

03:16.000 --> 03:23.000
So, people want these editors, and we must keep them in sync as well. So, how do we do this?

03:23.000 --> 03:29.000
And finally, and this is the main focus of the talk, the documentation, which actually

03:29.000 --> 03:35.000
must follow the same structure. It must describe it. It must be always up to date. And in particular,

03:35.000 --> 03:41.000
all the code snippets: these code snippets can be copied and pasted into

03:42.000 --> 03:50.000
customers' workflows. And they must work without saying, oh, this argument is no longer valid

03:50.000 --> 03:59.000
or something like this. This transformation has been renamed. So, in the very beginning of

03:59.000 --> 04:07.000
the development of this product, a decision has been made. And I think we are still getting benefits

04:07.000 --> 04:15.000
from this decision many years later: we are using a DSL approach as a single source of

04:15.000 --> 04:22.000
truth. So, although part of the product is written in Kotlin and part of it is written in

04:22.000 --> 04:33.000
TypeScript, the software developer starts by editing the DSL file, the YAML file, from which large

04:33.000 --> 04:39.000
chunks of artifacts are being generated. So, I counted six in our particular case, like

04:39.000 --> 04:46.000
DTOs, this JSON schema, the projectional editor, which is actually auto-generated. We don't

04:46.000 --> 04:56.000
make our front-end engineers keep up with all the controls and switches. And the documentation is one of

04:57.000 --> 05:05.000
these six outputs. So, what did we choose as the DSL, as the domain-specific language, for this?

05:05.000 --> 05:12.000
Surprisingly, in our case, it's the OpenAPI spec. As you can see, it defines all the

05:12.000 --> 05:18.000
type names, it defines the structure, like the properties, what parameters we have, it defines

05:18.000 --> 05:26.000
descriptions, nullability, lots of stuff, actually. So, if you ask why the

05:26.000 --> 05:31.000
OpenAPI spec, why such a strange choice, probably: well, it's just because it's

05:31.000 --> 05:38.000
fit for the purpose. It has built-in documentation fields, it supports extensions, it supports

05:38.000 --> 05:44.000
non-standard extensions via x- fields. So, if there is something which is not in standard

05:44.000 --> 05:51.000
OpenAPI, you just put a property anywhere whose name starts with x-, and you can put just any

05:51.000 --> 05:58.000
additional semantic value you want. And what's important, we have a very high quality

05:58.000 --> 06:07.000
widely used OpenAPI parser for the JVM, for Java. And, well, I believe that these days the success

06:08.000 --> 06:15.000
of a product comes mostly from using the open source that best fits your purpose, and writing

06:15.000 --> 06:24.000
some glue code. And if you choose your libraries carefully, the amount of glue code

06:24.000 --> 06:30.000
is going to be minimal. In our case, we already had code generators for Kotlin and for

06:30.000 --> 06:35.000
TypeScript to produce our DTOs. And the remaining step was to generate the documentation.

06:35.000 --> 06:43.000
So, we decided to generate AsciiDoc from the same source, from the OpenAPI spec, and then

06:43.000 --> 06:48.000
rely on the Asciidoctor toolchain for everything that follows. So, actually, only this

06:48.000 --> 06:56.000
tiny yellow part needed to be implemented. And by the way, in 2026, you just task your agent

06:56.000 --> 07:03.000
to write glue code. And it can be done really quickly. So, the full pipeline is like this.
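
NOTE
A minimal sketch of the kind of glue code described above, written in Python for brevity; it is not the speaker's Kotlin implementation. The schema dictionary, the generator name, and the x-since field are hypothetical, and a real pipeline would read the spec through an OpenAPI parser instead of a hard-coded dict.
```python
# Hypothetical OpenAPI-style schema for one generator (all names invented for illustration).
SCHEMA = {
    "title": "CategoryGenerator",
    "description": "Generates values from a fixed set of categories.",
    "x-since": "1.0",  # hypothetical non-standard extension field
    "required": ["values"],
    "properties": {
        "values": {"type": "array", "description": "Categories to choose from."},
        "probabilities": {"type": "array", "description": "Weight of each category."},
    },
}
def schema_to_asciidoc(schema: dict) -> str:
    """Render one schema object as an AsciiDoc section with a properties table."""
    required = set(schema.get("required", []))
    lines = [f"== {schema['title']}", "", schema.get("description", ""), ""]
    # Surface every x- extension field as a plain "name: value" line.
    for key in sorted(k for k in schema if k.startswith("x-")):
        lines.append(f"{key}: {schema[key]}")
    lines += ["", '[cols="1,1,1,3", options="header"]', "|===", "|Property |Type |Required |Description"]
    # One table row per property, in the order the schema declares them.
    for name, prop in schema.get("properties", {}).items():
        flag = "yes" if name in required else "no"
        lines.append(f"|`{name}` |{prop.get('type', '')} |{flag} |{prop.get('description', '')}")
    lines.append("|===")
    return "\n".join(lines)
if __name__ == "__main__":
    print(schema_to_asciidoc(SCHEMA))
```
The output is ordinary AsciiDoc (a section title, a paragraph, and a table), so it can sit next to hand-written pages and go through the same Asciidoctor toolchain.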

07:03.000 --> 07:10.000
So, as I said, there are a number of branches, a number of artifacts that we are producing. But

07:10.000 --> 07:18.000
we have only a small amount of glue code, using Swagger Parser and using

07:18.000 --> 07:26.000
KotlinPoet. KotlinPoet is also a very reliable library, which is a builder of Kotlin code. So, we

07:26.000 --> 07:32.000
just use this glue code and produce the Kotlin DTOs, the JSON schemas that we need, and the

07:32.000 --> 07:41.000
documentation. One thing: when we think of auto-generated documentation, we

07:41.000 --> 07:48.000
think of something like Python docstrings or Javadoc. And you know that they produce

07:48.000 --> 07:57.000
HTML right away from the code. But interestingly, if you want to do something yourself,

07:57.000 --> 08:03.000
what I recommend is to use some semantic markup, because it's much easier to produce than

08:03.000 --> 08:10.000
HTML. And also, it will merge nicely, it will blend in nicely with the rest of your documentation.

08:10.000 --> 08:16.000
Because if the rest of your documentation is written in the same markup, you will get automated

08:16.000 --> 08:23.000
documentation and the documentation written by humans in the same

08:23.000 --> 08:31.000
style, with the same visual style, which is important. So, speaking about the DSL: the DSL is a single source

08:31.000 --> 08:37.000
of truth. First of all, it's a design decision, a project design decision. It's not a decision

08:37.000 --> 08:44.000
that is taken by technical writers. It's taken by other people, and it must be taken at the very

08:44.000 --> 08:51.000
beginning. And it helps keep multiple parts of the product in sync, including, but not limited

08:52.000 --> 08:59.000
to the documentation. It's a very powerful approach. The choice of DSL, however, is context-dependent.

08:59.000 --> 09:06.000
So, in our case, it's the OpenAPI spec. In your case, it can be anything which fits the purpose.

09:06.000 --> 09:12.000
You have lots of options, like a fully custom DSL, if you feel like writing your own custom

09:13.000 --> 09:20.000
parser or something. An existing spec language, such as XSD or OpenAPI, whatever spec language you

09:20.000 --> 09:28.000
see fit. It can be a YAML- or XML-based format, or it can be an internal DSL in a host language, such as

09:28.000 --> 09:35.000
Groovy, Ruby, or Kotlin, which provide the syntactic capabilities to expose an internal DSL.

09:36.000 --> 09:42.000
The best results usually come from choosing a technology with strong out-of-the-box tooling. So, you

09:42.000 --> 09:51.000
only need a small amount of custom glue code. So, let's get to the Asciidoctor part of the presentation.

09:51.000 --> 10:00.000
And there's a question: why Asciidoctor? Why not something else? Well, here I'm a bit opinionated.

10:00.000 --> 10:07.000
For the DSL, I said, choose whatever you like. For Asciidoctor, I'm sorry, Daniel.

10:07.000 --> 10:14.000
I'm opinionated. I'm from the Asciidoctor camp. I believe that it just outperforms all the other

10:14.000 --> 10:19.000
markup languages. It has richer semantics. It has really powerful tables out of the box.

10:19.000 --> 10:25.000
It has attributes and conditional content, which makes documentation dynamic. And it has countless

10:25.000 --> 10:32.000
diagram and code integrations. And also, it's truly cross-platform. Because whether

10:32.000 --> 10:39.000
your project is in Ruby, in JavaScript, or on the JVM, you're using your build tools and you

10:39.000 --> 10:46.000
get the complete Asciidoctor tooling there. That's quite a rare example

10:46.000 --> 10:52.000
of truly cross-platform tooling. And my favorite feature is include, about which I'm going to

10:52.000 --> 10:58.000
speak separately. Complex tables: this presentation is written in

10:58.000 --> 11:05.000
Asciidoctor, by the way. So there's a GitHub repo for it. You can have a look. So good luck

11:05.000 --> 11:15.000
if you want to do these things in Markdown; it's like chopping onions.

11:16.000 --> 11:22.000
That's a basic thing, the same as in many languages. Let's have a look at an

11:22.000 --> 11:27.000
Asciidoctor example. These callouts, by the way, are supported by

11:27.000 --> 11:34.000
Asciidoctor. It's first-class support, no extensions, quite a unique thing. And yeah, you

11:34.000 --> 11:41.000
can do it in various languages. So, as you probably guessed, here you see Euclid's

11:41.000 --> 11:45.000
2,500-year-old algorithm for finding the greatest common

11:45.000 --> 11:48.000
divisor, presented in various languages.
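
NOTE
For reference, a sketch of the algorithm on the slide, written here in Python (an assumption; the slide shows other languages). The randomized check is a hand-rolled stand-in for a property-based testing library, and the helper name check_gcd_properties is hypothetical. Python integers are arbitrary precision, so fixed-width overflow is not a concern in this sketch.
```python
import math
import random
def gcd(a: int, b: int) -> int:
    """Euclid's algorithm: replace (a, b) with (b, a mod b) until b is zero."""
    while b:
        a, b = b, a % b
    return a
def check_gcd_properties(trials: int = 1000, seed: int = 42) -> None:
    """Check defining properties of the GCD on random positive inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        a, b = rng.randint(1, 10**9), rng.randint(1, 10**9)
        g = gcd(a, b)
        assert a % g == 0 and b % g == 0  # g divides both inputs
        assert gcd(a // g, b // g) == 1   # no larger common divisor remains
        assert g == math.gcd(a, b)        # agrees with the standard library
if __name__ == "__main__":
    check_gcd_properties()
    print(gcd(48, 18))  # prints 6
```
Running the check as part of the build is what turns a slide snippet into a tested claim.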

11:48.000 --> 11:56.000
Lua, Prolog, Ruby, Java. And you may ask what makes you

11:56.000 --> 12:02.000
believe that these are actually valid code snippets, that they actually produce

12:02.000 --> 12:08.000
the GCD. Well, I can assure you that all these snippets are tested before the whole

12:08.000 --> 12:15.000
presentation is built, using strong property-based tests, and

12:15.000 --> 12:20.000
apart from integer overflows they are correct. How do I achieve this? I'm going to

12:20.000 --> 12:29.000
tell you in a minute. But before that, I just can't help showcasing the

12:29.000 --> 12:36.000
integrations that Asciidoctor has, which actually allow you to keep

12:36.000 --> 12:43.000
colorful, powerful pictures and illustrations within your documentation,

12:43.000 --> 12:50.000
in plain text. Graphviz is quite low-level, but very flexible.

12:50.000 --> 12:56.000
You can do anything in it. If you want some formal stuff such as UML, you have

12:56.000 --> 13:03.000
PlantUML for this. If you are teaching languages, like me, you need these

13:03.000 --> 13:10.000
syntax diagrams and I'm really proud to show this slide because this tool in particular

13:10.000 --> 13:17.000
was implemented by my third-year Java students. And thanks to the amazing

13:17.000 --> 13:23.000
Asciidoctor community, it was included with first-class support in

13:23.000 --> 13:27.000
Asciidoctor. So if you want to describe some syntax, you have

13:27.000 --> 13:33.000
JSyntrax for this, and it's supported by Asciidoctor.

13:33.000 --> 13:41.000
It's a universe in itself, right? So stacked charts, bar charts: if you need

13:41.000 --> 13:45.000
them in your documentation, in technical documents, you don't need to copy and paste

13:45.000 --> 13:51.000
pictures. They can be kept as text. And of course, it's not that

13:51.000 --> 13:56.000
frequent that you need a formula in your technical documentation. But if you do need

13:56.000 --> 14:00.000
them, of course, there is only one format for formulas if you're taking

14:00.000 --> 14:06.000
formulas seriously: that's LaTeX, and Asciidoctor is capable of this out of the

14:06.000 --> 14:14.000
box. So it's extremely powerful, but my favorite feature is still

14:14.000 --> 14:23.000
include. Yeah, other toolchains are missing it. So what does it do? Well,

14:23.000 --> 14:28.000
a simple thing, right? So somewhere in my code base, I have this Java file and I

14:28.000 --> 14:32.000
just include it and it's just being included here on this slide. But Java is

14:32.000 --> 14:38.000
verbose and actually you don't see anything useful on this slide because of the

14:38.000 --> 14:44.000
header and all this stuff. But as you can see, if I put special

14:44.000 --> 14:51.000
comment lines here, I'm tagging a snippet and then later I can refer to this snippet.
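
NOTE
A sketch of what tagged regions do, in Python for brevity. In AsciiDoc itself the work is done by the include directive with a tag attribute, for example include::WordCount.java[tag=wordcount], where the file name and tag name are hypothetical. The extract_tag helper below only imitates that selection so the idea can be run and tested; it is not Asciidoctor's implementation.
```python
# Example source file content; the comment lines follow the tag::name[] / end::name[] convention.
SOURCE = '''\
import collections
# tag::wordcount[]
def count_words(text):
    return dict(collections.Counter(text.split()))
# end::wordcount[]
def main():
    print(count_words("to be or not to be"))
'''
def extract_tag(source: str, tag: str) -> str:
    """Return the lines strictly between the tag::<tag>[] and end::<tag>[] marker lines."""
    selected, inside = [], False
    for line in source.splitlines():
        if f"end::{tag}[]" in line:
            inside = False
        elif f"tag::{tag}[]" in line:
            inside = True
        elif inside:
            selected.append(line)
    return "\n".join(selected)
if __name__ == "__main__":
    # What the reader sees: only the tagged window of the file.
    print(extract_tag(SOURCE, "wordcount"))
    # What the build sees: the whole file is real code, so it can be executed and tested.
    namespace = {}
    exec(SOURCE, namespace)
    assert namespace["count_words"]("to be or not to be") == {"to": 2, "be": 2, "or": 1, "not": 1}
```
The snippet shown to the reader and the code exercised by the test come from the same file, so they cannot drift apart.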

14:51.000 --> 14:57.000
So actually what I see here is not a meaningless Java snippet,

14:57.000 --> 15:03.000
copied and pasted from somewhere. It's like a tiny window through which you

15:03.000 --> 15:09.000
see just a snippet of the whole big Java file. And then I put this on the

15:09.000 --> 15:13.000
slide for my students and I say that this is the solution of a classical

15:13.000 --> 15:17.000
problem of word counting. You have a text file and you count words and you

15:17.000 --> 15:22.000
have a map from string to long blah blah blah. And the question is, how do we

15:22.000 --> 15:27.000
make sure that this claim is accurate? We need a test for this. Okay, for

15:27.000 --> 15:31.000
the Java Stream API it's probably straightforward, but if it's some other tool

15:32.000 --> 15:37.000
that we are documenting, we need a test. So what do we do? Remember that this

15:37.000 --> 15:41.000
is actually not a snippet. This is actually a window into the big file. So this

15:41.000 --> 15:46.000
file can be unit tested. So I'm writing a unit test for this. Oh, sorry,

15:46.000 --> 15:51.000
Java is too verbose. Let me focus on the interesting part. And yeah,

15:51.000 --> 15:55.000
this is a unit test which actually calls the method in question and

15:55.000 --> 16:02.000
verifies that it compiles and that it runs. And here we are. If you keep the

16:02.000 --> 16:07.000
documentation and tested example code in the same repository, if you use

16:07.000 --> 16:12.000
include with tags to pull in only relevant chunks or only relevant fragments,

16:12.000 --> 16:18.000
you build docs and run the example code tests in the same CI pipeline. And

16:18.000 --> 16:23.000
this includes using the same build tool. This is why it's important that

16:23.000 --> 16:29.000
we have Asciidoctor, right? We can run Asciidoctor in npm. We can run Asciidoctor in

16:29.000 --> 16:35.000
a Ruby build tool. We can run Asciidoctor in Gradle as well. And it's either

16:35.000 --> 16:40.000
green or red. If some of the tests are red, then the build breaks, the

16:40.000 --> 16:47.000
CI fails, and you have no documentation. The final part is

16:47.000 --> 16:57.000
Antora. Why Antora? For the DSL: choose whatever you like.

16:57.000 --> 17:06.000
This is my point. For semantic markup, Asciidoctor is the best. That's my opinion.

17:06.000 --> 17:12.000
For Antora, well, frankly speaking, you don't have any other choice if you are using

17:12.000 --> 17:19.000
Asciidoctor. That's the truth. What's good about Antora? It has versioned

17:19.000 --> 17:26.000
docs as a first-class concept. So if you have a release cycle for your library or

17:26.000 --> 17:32.000
your product, you can create tags or branches in your Git repository, and Antora

17:32.000 --> 17:39.000
will retrieve them and publish them. Next, it has an opinionated layout for

17:39.000 --> 17:45.000
modules, assets, examples, and so on. You put some example here. You know where to put

17:45.000 --> 17:51.000
an image, where to put your SQL file, where to put your YAML file example, whatever.

17:51.000 --> 18:02.000
Antora has standard folders for this. It has a specific format for cross-references.

18:02.000 --> 18:12.000
So it has this notion of modules, examples, assets, and so on, which helps a bit when you

18:12.000 --> 18:20.000
want to move stuff around your documentation. So when you're not relying on a particular

18:20.000 --> 18:28.000
place on the file system, but your cross-references are expressed in terms of modules, then you

18:28.000 --> 18:35.000
can move your AsciiDoc files from module to module and your cross-references will

18:35.000 --> 18:41.000
keep working if you are careful enough. So that's a really powerful feature for maintaining

18:41.000 --> 18:49.000
large documentation. Last but not least, it's surprisingly fast. Those of you

18:49.000 --> 18:55.000
who know how long build cycles can be with Gradle and the like know they can take ages.

18:55.000 --> 19:04.000
Antora builds huge sites in a matter of seconds, and this is really impressive these

19:04.000 --> 19:17.000
days. However, during the previous presentations today, we saw a lot of cases for

19:17.000 --> 19:26.000
non-docs-as-code solutions, like a wiki or things like, what was it called,

19:26.000 --> 19:34.000
last year, right? So this presentation represents the other end of the spectrum.

19:34.000 --> 19:44.000
Everything is code. Everything is tested. Everything is on CI. Well, actually, from my current

19:44.000 --> 19:53.000
experience, I think, and this is still not a solved problem, we must support something

19:53.000 --> 20:00.000
in between. Because when you've got large Antora

20:00.000 --> 20:05.000
documentation, everything on CI, it becomes really difficult for people like solution architects

20:05.000 --> 20:13.000
or marketing people to edit it, to move contents around, to add some descriptive chunks.

20:13.000 --> 20:20.000
Also, it becomes too burdensome to sync it with the release cycles. Like, I want to update it now,

20:20.000 --> 20:28.000
or no, we need to wait for the next release. So probably if your documentation is mostly descriptive,

20:28.000 --> 20:37.000
is decoupled from the release cycle, you should probably think about something else, like a wiki or other tools

20:38.000 --> 20:44.000
that we discussed earlier. But if you want to keep it tightly in sync with the code and with the

20:44.000 --> 20:52.000
features of your product, then this approach is definitely the best. So, to the conclusions:

20:52.000 --> 20:58.000
keep the documentation close to the source of truth. Ideally, use the same

20:59.000 --> 21:04.000
repository on GitHub for everything: documentation, code, and tests.

21:04.000 --> 21:12.000
Generate documents as plain AsciiDoc and let Asciidoctor do the rest. It's really easy

21:12.000 --> 21:19.000
to code-generate AsciiDoc. Use the include magic in order to include testable examples, and make

21:19.000 --> 21:27.000
all the examples of your code, be it SQL, YAML, or Java, testable and tested. And yeah,

21:27.000 --> 21:33.000
writing the documents and engineering the whole solution are actually

21:33.000 --> 21:39.000
the same process. So when they're blended, you get the best results. So thank you for listening.

21:39.000 --> 21:46.000
You can see the source code of this presentation here with all the examples and all the code

21:47.000 --> 21:53.000
examples in this presentation are really tested. Thank you.

22:08.000 --> 22:14.000
About wikis: yeah, first of all, one hour ago there was a presentation

22:14.000 --> 22:21.000
which demonstrated that Asciidoctor actually provides you with the ability to write

22:21.000 --> 22:28.000
your own output templates. So yes, it can produce HTML, it can produce PDF,

22:28.000 --> 22:34.000
and there are templates for OpenDocument formats. I haven't heard about, you know,

22:34.000 --> 22:40.000
wiki pages, but I think it shouldn't be that difficult to write your own template because it's

22:40.000 --> 22:47.000
merely a conversion from one semantic markup to another semantic markup. Yes, please.

22:47.000 --> 22:53.000
Are you familiar with the concept of literate programming?

22:53.000 --> 23:01.000
I've heard something, definitely, but yeah, I'm not familiar. The question was,

23:02.000 --> 23:16.000
am I familiar with literate programming? I've heard the name, but I don't know what it is.

23:16.000 --> 23:24.000
Org-mode syntax? No, no, no, no, no. Let's discuss.

23:24.000 --> 23:28.000
question of the day. Questions? Yes, please.

23:39.000 --> 23:43.000
You have Terraform and you want it to be beautifully documented.

23:43.000 --> 23:47.000
100% it does work. This is the idea.

23:47.000 --> 23:51.000
If you have something deep, but you need to find a

23:52.000 --> 23:55.000
uniform, you know, documents.

23:55.000 --> 23:58.000
But yeah, Terraform has additional attributes and

23:58.000 --> 24:03.000
yeah, I see no problems in implementing this.

24:06.000 --> 24:08.000
Yeah, yeah.

24:10.000 --> 24:12.000
Infrastructure as code.

24:17.000 --> 24:19.000
Not like what?

24:21.000 --> 24:28.000
So the question is, what if the DSL is not the

24:28.000 --> 24:32.000
OpenAPI spec, but some infrastructure-as-code DSL such as

24:32.000 --> 24:36.000
Ansible, Terraform, and so on.

24:36.000 --> 24:40.000
Yes, from what I know about these DSLs,

24:40.000 --> 24:45.000
I would be capable of writing something which

24:45.000 --> 24:50.000
will produce semantic markup for documenting this.

24:50.000 --> 24:53.000
That's definitely possible.

24:53.000 --> 24:57.000
Thank you very much for your help.

24:57.000 --> 24:58.000
Thank you.

