WEBVTT

00:00.000 --> 00:18.000
Okay. Hello, everybody. I'm Juanillo Lott. I'm a software engineer at env0. I'm also part

00:18.000 --> 00:27.400
of the OpenTofu core team. And I want a show of hands: who is using infrastructure

00:27.400 --> 00:34.800
as code today? Nice. Who is considering using infrastructure as code but maybe didn't start

00:34.800 --> 00:45.800
yet? Okay. Great. So, today we'll talk about infrastructure as code and why we need it.

00:45.800 --> 00:52.360
We will discuss how to choose the right infrastructure as code tool for our needs, how our organization's

00:52.360 --> 00:59.240
maturity should affect the decision, and we will discuss the unified DevOps process, meaning

00:59.240 --> 01:07.480
how we integrate our infrastructure as code into our CI/CD pipeline. So, I'll start with

01:07.480 --> 01:16.960
a story. 12 years ago, I landed my first software engineering job in an on-prem organization.

01:16.960 --> 01:22.320
Everything was completely air-gapped, and our architecture was pretty simple and looked like

01:22.320 --> 01:29.240
what you see on the left. We had bare-metal servers with a virtualization layer, and on top

01:29.240 --> 01:36.560
of that some backend applications, all behind one load balancer, one centralized database, and a few

01:36.560 --> 01:45.280
queues, maybe. And one day I went to the office and I opened my web application that I was working

01:45.280 --> 01:52.720
on, but I kept getting a 502 status, Bad Gateway. I called the IT guy, and after a

01:52.720 --> 01:58.880
few minutes he told me, "I'm sorry, we can't find the physical server that your application

01:58.880 --> 02:04.120
was hosted on." So, it ended up with me going down to the data center with this person,

02:04.120 --> 02:14.760
just searching for the missing server, which we never found. Yeah. Lovely. So, that's why scaling

02:14.760 --> 02:24.840
was really hard. My organization had to keep maintaining our own hardware and physical

02:24.840 --> 02:30.440
servers. But then cloud computing entered our lives, and now the architecture looks more like

02:30.440 --> 02:39.400
what you see on the left side. And it wasn't just cloud computing; there were more

02:39.640 --> 02:45.480
factors to why our architecture became more and more complex. If it's the shift from waterfall

02:45.480 --> 02:52.200
to agile, where we keep developing in more iterative cycles, so we keep adding new features. If

02:52.200 --> 02:59.160
it's the DevOps approach, where now we're trying to automate everything. And of course, monolith

02:59.160 --> 03:04.120
to microservices: the shift in this approach ensured that our infrastructure was broken into

03:04.200 --> 03:10.360
multiple parts. Each microservice has its own infrastructure. From a single database, we now have multiple

03:10.360 --> 03:21.560
fragmented databases. So everything became more and more complex. And at the beginning of the cloud

03:21.560 --> 03:29.800
computing era, ClickOps was enough. But then, as things kept getting more and more complex,

03:29.800 --> 03:33.320
infrastructure as code was introduced. And the idea behind infrastructure as code is that

03:35.320 --> 03:41.640
practices that are good for developers are good for DevOps people. We have one document,

03:41.640 --> 03:47.080
it's the source of truth. Now it's really easy to understand what our infrastructure should

03:47.080 --> 03:51.960
look like. It's really easy to promote it between different environments, if it's development, integration,

03:51.960 --> 03:59.640
production. And we can also automate it, and we have version control, and we know who did what

03:59.720 --> 04:08.120
and why. Hopefully, if they really wrote a proper commit message. Let's take a quick look at

04:08.120 --> 04:13.800
the infrastructure as code timeline. This is a very partial timeline; there are so many different

04:13.800 --> 04:18.760
tools out there. At the beginning of infrastructure as code, we had very simple scripting, if it was

04:19.640 --> 04:26.920
Bash or Python. But then the first infrastructure as code tools came to exist, and they were more

04:27.000 --> 04:34.440
dedicated to and focused on managing existing configuration, and it's, for example,

04:34.440 --> 04:40.920
Puppet and Chef, existing resources. And then in 2010, CloudFormation entered our lives

04:40.920 --> 04:47.880
and shifted the balance, and now some infrastructure as code tools are more focused

04:47.880 --> 04:55.000
on creating and provisioning new resources. And ever since, we have lots and lots of tools that we

04:55.080 --> 05:04.760
will cover in a minute. Awesome. So we need to choose the right tool, right? We have so many considerations,

05:05.720 --> 05:12.120
just a few of them. There's security and compliance: if we accidentally misconfigure our

05:12.120 --> 05:17.640
infrastructure as code tool, a server, like our internal application, can be exposed to the

05:17.640 --> 05:22.440
internet. People can maybe get their hands on information that we don't want them to.

05:23.400 --> 05:28.520
There's state management: some infrastructure as code tools have to manage some kind of state

05:28.520 --> 05:36.360
to keep track of all the resources, all our architecture, and we need to understand how they manage

05:36.360 --> 05:43.480
that. Sometimes we need to manage it ourselves; sometimes it contains sensitive secrets. And scale and

05:43.480 --> 05:50.600
performance: I don't need to tell you how annoying it is to wait for something to deploy for a long

05:50.600 --> 05:57.160
time, especially if we have a production issue and we need to fix something right here, right now.

05:57.160 --> 06:02.600
And of course, learning curve and adoption is a very important consideration: if it's too complex,

06:02.600 --> 06:09.800
people won't adopt it. And we need to have proper documentation, community support, tooling, and an

06:09.800 --> 06:18.600
ecosystem to ensure quick adoption. So this is where you kind of imagine Mortal Kombat music,

06:19.480 --> 06:26.840
and we will cover six tools. We have an overwhelming number of tools here;

06:26.840 --> 06:35.160
I personally use most of them. But there are so many options out there, so let's talk about six

06:36.120 --> 06:42.920
of them. So we have Terraform. It's really an industry standard. Another show of hands:

06:42.920 --> 06:52.200
who is using Terraform in here? Yeah, very expected. It's cloud agnostic, but not only cloud agnostic:

06:52.200 --> 07:01.400
you have providers for everything, for SaaS solutions, for example Auth0. My company, env0,

07:01.400 --> 07:10.040
also has a Terraform provider. Also Kubernetes. You can even order pizza with a Terraform provider.

07:11.000 --> 07:17.480
It has a very large community and ecosystem, and lots of tools around that, and the best practices

07:17.480 --> 07:23.480
and documentation encourage you to build really modular and reusable infrastructure. You have

07:23.480 --> 07:29.240
lots of modules online that you can already use. I put this one in gray because I think there is no

07:29.240 --> 07:37.480
consensus around it: some people love the domain-specific language that Terraform uses,

07:37.480 --> 07:46.360
HCL; some people really don't like it. People who don't come from programming backgrounds usually

07:47.240 --> 07:51.720
tend to love it because it's declarative and really, really easy, but programmers don't like it

07:51.720 --> 07:57.400
because it takes away some of their freedom. Personally, as a programmer who shifted into platform

07:57.400 --> 08:03.480
engineering, I think it's very useful and very easy to use.

08:04.360 --> 08:08.760
Performance and scalability: I put it in gray because it's really not bad compared to other tools,

08:08.760 --> 08:15.560
but when your state becomes really large and your infrastructure becomes quite complex, it can be

08:15.560 --> 08:24.200
kind of slow. And you need to manage your own state file, which means you really have to

08:24.200 --> 08:31.640
understand where you want to host it. It's not encrypted, everything is in plain text, so you need to

08:31.720 --> 08:39.480
also understand how you encrypt it. And it used to be open source, but changed to the BSL in

08:39.480 --> 08:48.680
August 2023. Yeah. And the next tool is

08:48.680 --> 08:55.400
CloudFormation, maintained by AWS. That means the vendor is the one responsible for the documentation

08:55.400 --> 09:02.680
and best practices, which is really nice. State is managed behind the scenes, which can be a pro and a con.

09:02.680 --> 09:12.360
I personally never had a problem with the state, so I like it. AWS added new features recently:

09:12.360 --> 09:19.240
built-in drift detection, GitOps flows. And when it comes to syntax, we really have

09:19.240 --> 09:25.720
YAML; usually it's a YAML configuration, also a declarative one, which I personally find less

09:27.080 --> 09:34.680
simple than Terraform, less friendly to use. And the bad stuff: it only supports the AWS

09:35.560 --> 09:41.960
cloud. Yeah, you can't deploy things to other providers, you can't order pizza with that.

09:41.960 --> 09:49.960
And the language is less friendly when trying to create modular infrastructure, from my experience.

09:51.160 --> 09:59.720
The next tool is Pulumi. It's quite an interesting one because you use

09:59.720 --> 10:05.960
familiar languages, TypeScript, Python, Go, to write and describe your infrastructure code.

10:06.280 --> 10:13.400
When it comes to providers, it's very, very similar to Terraform: it's cloud agnostic, you have multiple

10:13.400 --> 10:19.320
providers, and some of them are based on Terraform

10:19.320 --> 10:25.000
providers. There are multiple ways to manage your state: they can manage it automatically for you, and

10:25.000 --> 10:31.560
then it's also encrypted, or you can manually manage it on your own. Because it uses code and functions,

10:31.640 --> 10:40.280
it's really easy to create modularization. And here, again, it's in gray because the language

10:40.280 --> 10:47.080
familiarity can be very good for developers, but DevOps people can really struggle with learning a new

10:47.080 --> 10:58.360
programming language. It's a less popular tool, the community size and support are smaller, and performance

10:58.440 --> 11:09.480
on big deployments can also suffer. And my take is that I like guardrails. When I have limitations,

11:09.480 --> 11:16.600
then I feel safer; I feel like I can't overcomplicate my architecture. And Pulumi kind of gives me,

11:16.600 --> 11:25.800
in my opinion, too much freedom that I don't always want. And we will be comparing that to

11:25.800 --> 11:34.760
OpenTofu. As I said before, but again, due diligence: I've been part of the OpenTofu core team for the last

11:35.800 --> 11:43.640
year and a little. OpenTofu is a drop-in replacement for Terraform, after the change of the license

11:43.640 --> 11:49.640
by HashiCorp. It's truly open source. It's backed by the Linux Foundation; that's why it will always

11:49.720 --> 11:57.720
remain open source. Because it's a fork of Terraform, it inherited all the Terraform,

11:57.720 --> 12:05.960
the extensive Terraform ecosystem. One of the first features we implemented, in

12:05.960 --> 12:13.480
1.7, was state encryption, to ensure that you can encrypt your secrets in your state

12:13.560 --> 12:21.640
at rest, so they won't be in plain text. And, as mentioned about Terraform,

12:21.640 --> 12:26.760
it's a domain-specific declarative language. Performance and scalability: it's pretty much the same,

12:26.760 --> 12:32.920
although we're trying, with the community, to improve that. And obviously, since it's a new tool,

12:32.920 --> 12:40.040
there is a relatively small but growing community, and I think it still needs to prove itself.

12:40.840 --> 12:48.520
Our last comparison: Ansible. So this one is a little different because, as we said,

12:48.520 --> 12:55.080
we have infrastructure as code tools that provision resources, but Ansible is dedicated to

12:55.080 --> 13:03.080
configuration management. It's built in Python, but the configuration language is in

13:03.080 --> 13:11.880
YAML. You write procedural playbooks declaring what to do in each step. It's agentless, so you

13:11.880 --> 13:19.400
don't have to handle installation and updates of agents, although you have to enable

13:19.400 --> 13:28.360
SSH in order for your servers to communicate. And it's an open source tool with a

13:28.360 --> 13:37.640
very large community and thousands of Ansible Galaxy roles that you can use online. But it

13:37.640 --> 13:45.560
has no state management, so it runs everything every single time, because it can't compare against an

13:45.560 --> 13:57.480
existing state. Let's compare it to Salt Project. It is a similar solution. It uses a

13:57.480 --> 14:06.120
server-client configuration that can require agents, but not always. On the other hand, you don't have

14:06.120 --> 14:15.480
to enable SSH, and it's quite fast. It's faster than Ansible, from what we've used. And it

14:15.480 --> 14:22.600
uses state for configuration management. But, again, usually you'll have agents, so you need to maintain them

14:23.560 --> 14:32.760
and ensure updates are going smoothly. It was also acquired by VMware and then Broadcom, so

14:32.760 --> 14:40.360
it's fully open source now, but let's see what will happen. I hope it will stay the

14:40.360 --> 14:47.560
way it is. So, after we talked about those tools, it's really, really tempting to ask which

14:47.640 --> 14:57.800
tool is the best. And as always, the answer is: it really depends. There is not really a best tool for

14:58.520 --> 15:04.360
all scenarios. As we said, there are lots of questions that you need to understand based on your

15:04.360 --> 15:10.920
organization: what platforms are you using, what cloud are you using, multiple clouds? Who will

15:10.920 --> 15:16.520
maintain the infrastructure as code, if it's developers, if it's maybe the DevOps team, and what

15:16.520 --> 15:25.800
technologies are they familiar with? And we should always consider mixing and matching tools. Yeah, so

15:25.800 --> 15:32.040
we should always consider mixing and matching tools and create solutions that are

15:32.040 --> 15:39.720
more tailored to our needs. So, as we said, some tools have a different focus: provisioning

15:39.720 --> 15:45.480
new resources, managing existing ones. We can mix and match them. Terraform and OpenTofu have an

15:45.560 --> 15:54.840
Ansible provider, which is cool. And in all the companies I've ever worked in, there were always

15:54.840 --> 16:02.920
multiple tools in use. Sometimes, for example, infrastructure as code that was maintained

16:02.920 --> 16:10.520
by developers was written in Pulumi, for example, and maybe the more general infrastructure

16:10.600 --> 16:17.400
as code for the whole organization, maintained by the DevOps team, was written in Terraform.

16:20.440 --> 16:28.760
Awesome. So we need to take a look at our organization's maturity to understand how we use

16:28.760 --> 16:35.240
our infrastructure as code tool. It really depends. When we start, usually, and the organization

16:35.320 --> 16:42.360
is really, really small, we probably start with ClickOps. But then, when it doesn't cut it anymore,

16:43.720 --> 16:50.040
it's an ideal time to really start using some kind of infrastructure as code, but with manual

16:50.040 --> 17:01.160
deployments from our local computers. But sometimes there comes a time when

17:01.800 --> 17:08.760
multiple people start working on that infrastructure as code. Maybe we want multiple people to

17:08.760 --> 17:14.040
deploy our infrastructure as code, but we don't want them to have access to sensitive secrets.

17:14.040 --> 17:20.520
So that's when we start thinking about automation, and maybe taking the infrastructure

17:20.520 --> 17:28.680
as code tools and automating their deployments as part of our CI/CD. And the last stage, I think,

17:28.760 --> 17:34.280
is where your infrastructure becomes a bottleneck in your organization. That's the time

17:34.280 --> 17:43.400
to consider a more extensive platform, if it's in-house, or you can also buy an existing tool, to get

17:43.400 --> 17:53.160
better governance and self-service capabilities. So we are focusing on stage two, when we want to

17:53.160 --> 18:01.720
start creating automation around our infrastructure as code. So let's see what our CI/CD pipeline will

18:01.720 --> 18:08.920
look like. So we first write code, we push it, we probably open a pull request so other people can

18:10.520 --> 18:17.640
see what we're about to change; they can validate that these changes are good. And then we

18:18.040 --> 18:26.360
have CI/CD steps, so we can use linting and testing, and we will discuss a few tools here,

18:26.360 --> 18:33.000
if it's Infracost and Checkov, on how we make our CI/CD pipeline with infrastructure as

18:33.000 --> 18:41.880
code even better. Then, after that, we can use approval policies to validate that

18:42.840 --> 18:51.720
our plan, the thing that is about to change, is okay, is following the policies of our organization.

18:51.720 --> 18:59.960
And only when this is fine, when the automatic approval flow has succeeded and somebody

18:59.960 --> 19:10.440
approves our pull request, we can apply and keep monitoring our infrastructure. So we'll

19:10.520 --> 19:17.480
briefly talk about some tools, as I said, that can help our infrastructure as code CI/CD

19:17.480 --> 19:24.840
become even better. If it's Checkov: it's a static code analysis tool that scans our

19:24.840 --> 19:33.880
infrastructure as code for misconfigurations and security vulnerabilities. And we can write our own policies

19:34.120 --> 19:40.600
or use existing policies and ensure, for example, that our internal instance is not

19:40.600 --> 19:50.360
exposed to the whole world. And another very cool tool is Infracost. We can use it to

19:50.360 --> 19:56.680
actually know how much the thing we are about to deploy will cost us. Obviously it

19:56.680 --> 20:03.560
can't anticipate usage costs, but it can find out if we're about to, for example, and it actually

20:03.560 --> 20:13.960
happened in my company: somebody accidentally deployed a very pricey Redis instance, and

20:13.960 --> 20:21.880
if we had used that back then, we would have picked right up on that

20:22.760 --> 20:31.720
and seen how much it was about to cost us. So we can integrate, for example,

20:31.720 --> 20:41.800
we have OPA, Open Policy Agent. It's a policy-as-code agent, and we can write our own policies,

20:41.800 --> 20:47.800
and now we can reduce the bottleneck of human approval, because they will be automatically

20:47.800 --> 20:54.840
checked and decide if this code is fine, if this thing is okay or not.

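NOTE
A minimal sketch of the automated policy check described above, written in Python for illustration. Real OPA policies are written in Rego, so this is only a stand-in for the idea. It assumes the plan was exported as JSON, where each entry under resource_changes carries an address, a type and change.actions. The rule itself and the names PROTECTED_TYPES, find_violations and SAMPLE_PLAN are hypothetical.
```python
# Hypothetical rule: protected resource types must never be deleted automatically.
PROTECTED_TYPES = {"aws_db_instance", "aws_s3_bucket"}
#
def find_violations(plan: dict) -> list[str]:
    """Return the addresses of protected resources that the plan would delete."""
    violations = []
    for rc in plan.get("resource_changes", []):
        actions = rc.get("change", {}).get("actions", [])
        if "delete" in actions and rc.get("type") in PROTECTED_TYPES:
            violations.append(rc["address"])
    return violations
#
# Hand-written sample shaped like a JSON plan export (not real tool output).
SAMPLE_PLAN = {
    "resource_changes": [
        {"address": "aws_db_instance.main", "type": "aws_db_instance", "change": {"actions": ["delete"]}},
        {"address": "aws_instance.web", "type": "aws_instance", "change": {"actions": ["create"]}},
    ]
}
#
if __name__ == "__main__":
    # A CI step could fail the pipeline whenever this list is not empty.
    print(find_violations(SAMPLE_PLAN))
```
In a pipeline, an empty result would let the automatic approval pass, and anything else would fall back to a human reviewer.
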
20:54.840 --> 21:00.680
And it's really nice because you can actually integrate OPA with Infracost and maybe create

21:00.680 --> 21:08.520
some kind of policies and budgeting around whether people are about to go over budget, or

21:08.520 --> 21:17.640
something like that. Okay. So, in summary, we've discussed different types of

21:17.640 --> 21:24.760
infrastructure as code tools. We said there is no best tool for all the scenarios, and it really

21:24.760 --> 21:31.480
depends on our organization's maturity level, the technology stack we're using, our use cases and

21:31.480 --> 21:38.040
requirements. And we should always adopt a strategic and adaptable approach when choosing our

21:38.120 --> 21:44.840
infrastructure as code tool. Thank you. Do you have any questions?

22:02.040 --> 22:07.800
Honestly, yeah, I should repeat the question. So, the gentleman asked me

22:08.040 --> 22:14.600
why I didn't mention Crossplane and if I've ever worked with it. Honestly, I never did,

22:14.600 --> 22:21.240
but I think it's a very, very cool tool. Lots of people and other DevOps engineers that I'm working with

22:21.240 --> 22:29.880
are using it to simplify their infrastructure as code for developers. For example, my husband:

22:30.680 --> 22:37.400
he likes to use Terraform and OpenTofu, but when he wanted to give access to developers

22:37.480 --> 22:46.120
in an organization, he started creating a very simplified approach using Crossplane. So it's a

22:46.120 --> 22:54.520
very cool tool, but I still haven't used it. Next question? Yes. Yes, so, you get a free

22:54.840 --> 23:16.360
here, so, yeah. Okay. Yeah, so he asked me why I am contributing to OpenTofu,

23:16.360 --> 23:22.760
why I decided to do that. So env0 is one of the founders of OpenTofu; it's one of the backing

23:23.080 --> 23:30.040
companies. Ohad, our CEO, was the one who started the manifesto when

23:30.040 --> 23:35.000
HashiCorp changed the license. That's really the thing that got everything rolling and led to

23:35.000 --> 23:42.200
OpenTofu's creation. And honestly, I just saw an amazing opportunity. I've been using Terraform for

23:42.200 --> 23:48.120
years and I love this tool, and I just came to my boss and I'm like, "Can I please, please work on that?"

23:48.200 --> 23:56.120
I knew that we needed to donate some developers to work on that, and I asked,

23:56.120 --> 24:02.760
and they let me do that. And the last year was really amazing: the first time I've

24:02.760 --> 24:09.960
been working in open source, getting in touch with the community, creating all sorts of relationships,

24:09.960 --> 24:16.280
if it's with JetBrains, because we wanted support for OpenTofu, or looking at the VS Code extension.

24:17.160 --> 24:27.320
It was really, really interesting. Next question? Yes. Yes.

24:27.320 --> 24:34.360
As a class manager, we see OpenTofu and Terraform diverging, which is going to be happening.

24:34.920 --> 24:40.280
What's your take on maintaining an environment that supports both of these use cases?

24:40.440 --> 24:51.080
Yeah, so I was asked, if I understood it, whether Terraform

24:51.080 --> 24:57.960
and OpenTofu are diverging, and how to really support an ecosystem around the two of them,

24:57.960 --> 25:07.640
right? Is that correct? Okay. So, as an OpenTofu maintainer: we're really trying not to

25:07.720 --> 25:14.600
diverge from Terraform. When it comes to providers and modules, it is the same

25:14.600 --> 25:21.720
ecosystem. We still support all the modules and providers of Terraform, and we want to keep doing

25:21.720 --> 25:28.360
that. That's the big advantage of OpenTofu, because we have this amazing ecosystem,

25:28.360 --> 25:36.440
and we understand that, and we're not taking divergence lightly.

25:38.600 --> 25:43.880
So usually what we're trying to do is create new features, but also

25:44.840 --> 25:51.640
support some features that Terraform also released, to ensure there is very good compatibility

25:51.640 --> 25:58.600
between Terraform and OpenTofu, and that users can switch back and forth pretty easily.

26:00.520 --> 26:04.920
But that's a good question: what if you have a giant [inaudible]

26:05.880 --> 26:10.360
and you want some of your infrastructure as code to be written for OpenTofu and some to

26:10.360 --> 26:16.840
use Terraform? Actually, we created a feature, not that new, that is called,

26:16.840 --> 26:22.040
it's called .tofu. You can set some configuration as .tofu, and then OpenTofu will be the

26:22.040 --> 26:31.800
one that reads those files and ignores the respective .tf files. So there are ways and practices to do

26:31.880 --> 26:34.520
that. I think more is written about it on our website.

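NOTE
A rough Python sketch of the .tofu behavior described above, for illustration only. It assumes the rule is simply that a .tofu file replaces the .tf file with the same base name, and it ignores other cases such as JSON variants or override files. The function name select_config_files and the file names are hypothetical.
```python
def select_config_files(filenames: list[str]) -> list[str]:
    """Pick the files OpenTofu would read under the simplified rule above."""
    # Base names that have an OpenTofu-specific variant, e.g. "main" for "main.tofu".
    tofu_stems = {name[: -len(".tofu")] for name in filenames if name.endswith(".tofu")}
    selected = []
    for name in sorted(filenames):
        if name.endswith(".tofu"):
            selected.append(name)
        elif name.endswith(".tf") and name[: -len(".tf")] not in tofu_stems:
            # A plain .tf file is kept only when no same-named .tofu file shadows it.
            selected.append(name)
    return selected
#
if __name__ == "__main__":
    print(select_config_files(["main.tf", "main.tofu", "variables.tf"]))
```
Under this assumption Terraform would keep reading main.tf and variables.tf, since it does not look at .tofu files, so both tools can share one directory.
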
26:37.240 --> 26:42.120
Time's up. Thank you so much.