Building a Robust and Affordable Infrastructure With Free Services and Open Source Tools

You get billed only 4 cents a month.

Wait, what? Just 4 cents??

Yup, you heard that right.

Emphasis on getting billed only 4 cents, though – the actual monthly cost is a bit higher once you include the electricity bill, and electricity matters here because we are about to self-host some services!

Preface

Now, this blog post won't be a step-by-step tutorial, but more of a high-level write-up of things I would've liked to know when I went fully independent.

See, as an independent developer I wanted my own infrastructure, because once it's built I'll be using it for the rest of my life. So the question I wanted to answer was: what's the cheapest way to build it? Unsurprisingly, self-hosting turned out to be a key ingredient in this recipe!

So, what’s the agenda?

  1. We need to choose a tech stack.
    • That is, we need to choose the host operating system as well as any other relevant tools we're going to use.
  2. Determine the backup procedure.
    • Once we are done choosing the tools we are going to use in step 1, the data we manage within those tools and operating system will inform our backup procedures.
  3. Handpicking external services.
    • Self-hosting everything nowadays is not ideal. Specifically, you may want to outsource your mail service, off-site server monitoring, cloud storage, and any web servers that are critical to your business and should have high uptime.

So without further ado, let’s get to it!

Choosing the technology stack

The Host Operating System

This could be the most arduous part of self-hosting since there's a plethora of options out there: from Windows Server, to macOS Server, to all the *BSDs (e.g. NetBSD, FreeBSD, TrueNAS), plus every flavor of Linux out there, including ones specifically designed for hosting such as Yunohost. The choices seem endless.

In the end, I opted for Arch Linux. I use it as my daily driver, and while some might argue there are downsides to using it as a server operating system – no official SELinux support, configuration file formats that occasionally change dramatically, and things simply breaking due to the rolling-release nature of the distribution – there are counterarguments too. It has official AppArmor support, which is roughly equivalent to SELinux, and as for the stability of the user-land software, that's what containers are for. At that point, if something breaks it will most likely be a kernel-land issue, in which case you can easily revert to an older kernel.

If you've never heard of or used Docker containers (or simply containers), the easiest way I've found to explain them is to look at the root folder of your storage drive. On Windows, open File Explorer, click on This PC and open C:\. On Mac, Linux, and the *BSDs, open a terminal and type ls /. If you copy everything you see there into another folder, what do you get? Another environment you can run things from!

I know this is not the most precise definition. The Open Container Initiative has a much better one right on its front page, and I could spend this whole article rambling about the magic underneath that makes it all happen – Linux namespaces, capabilities, virtual network devices, and how Mac, Windows, and the BSDs have similar tech that enables them to run containers – but for the sake of finishing this article I have to move on.

Connecting the containers

With the operating system chosen (Arch Linux), we now need a way to manage the containers and connect them together, and for my purposes that requires two different tools: podman and podman-compose. Docker and docker-compose are far more popular, but I prefer podman: it lets me use crun as the container runtime, which is far more lightweight than Docker's default runc; it has better integration with systemd from what I've found; and it doesn't need a service running all the time the way Docker does. So for the rest of this article, I'll be using podman.

With podman you can start, stop, and view the logs of your containers – it is a container manager, after all. And with podman-compose you can easily connect two containers together: each service specified in the compose file gets a scoped fully qualified domain name.
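As a quick sketch, the day-to-day management boils down to a handful of commands (this assumes podman and podman-compose are installed; the container name is just an example):

```shell
podman ps                   # list running containers
podman logs -f wordpress    # follow a container's logs
podman stop wordpress       # stop a container
podman-compose up -d        # bring a compose file's services up in the background
podman-compose down         # tear them down again
```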

For example, imagine you have a blog running WordPress, which requires a web server with PHP and a MySQL database. How do you connect them easily? Well, this is how!

```yaml
version: '3.1'

services:

  wordpress:
    image: docker.io/library/wordpress:latest # this image provides everything you need except the MySQL database
    restart: always
    ports:
      - 8080:80
    environment:
      WORDPRESS_DB_HOST: db # this is the domain name for the MySQL container!
      WORDPRESS_DB_USER: exampleuser
      WORDPRESS_DB_PASSWORD: examplepass
      WORDPRESS_DB_NAME: exampledb
    volumes:
      - wordpress:/var/www/html

  db: # the MySQL container domain name is defined in this line!
    image: docker.io/library/mysql:5.7
    restart: always
    environment:
      MYSQL_DATABASE: exampledb
      MYSQL_USER: exampleuser
      MYSQL_PASSWORD: examplepass
      MYSQL_RANDOM_ROOT_PASSWORD: '1'
    volumes:
      - mysql-db:/var/lib/mysql

volumes:
  wordpress:
  mysql-db:
```

Save that as wordpress/container-compose.yml, run cd wordpress; podman-compose up and, assuming you've configured podman, you'll have a WordPress instance running at http://localhost:8080 in no time!

But what now? That only gives us one web application on the machine. It should be possible to host multiple services on the same machine, right? Of course! That's what reverse proxies are for – and I wish I had known this earlier!

See, DNS servers provide the service that converts, say, wordpress.localhost into an IP address (127.0.0.1 in this example), which is where the computer running the service we are looking for is at. What I did not realize is that every time you enter a URL such as https://rubonnek.com and press Enter in your browser, the HTTP request includes a Host header that specifies which host (i.e. which service) the user is requesting:

```http
GET / HTTP/1.1
Host: rubonnek.com
```

And a reverse proxy can use that header to route the HTTP request to the correct service, whether it's running in a container (our use case) or on another machine on the network. This is called Layer 7 routing, referring to the top-most layer of the OSI model, and it can be very useful in certain scenarios.
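The routing decision itself is tiny. Here's a sketch of it in shell – the hostnames and backend addresses below are made up for illustration:

```shell
#!/bin/sh
# Sketch of the Layer 7 decision a reverse proxy makes: map the value
# of the Host header to a backend address. Hostnames and ports are
# hypothetical examples.
route_host() {
  case "$1" in
    blog.example.com) echo "127.0.0.1:8080" ;;
    git.example.com)  echo "127.0.0.1:3000" ;;
    *)                echo "no backend" ;;
  esac
}

route_host "blog.example.com"
```

A real reverse proxy does much more (TLS, retries, load balancing), but at its core it's this lookup performed on every request.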

So what application can do this easily with containers? Traefik! And Caddy! And possibly others, but I chose Traefik because it had everything I needed. What I found most impressive about it is its concept of middlewares.
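To sketch how this looks with podman-compose and Traefik v2: Traefik reads labels from the container engine and routes any request whose Host header matches a router rule. Everything below – service names, the domain, and the socket path – is a hypothetical example rather than a drop-in config:

```yaml
services:
  traefik:
    image: docker.io/library/traefik:latest
    command:
      - --providers.docker=true        # discover containers via labels
      - --entrypoints.web.address=:80
    ports:
      - 80:80
    volumes:
      # podman exposes a Docker-compatible API socket; the exact path
      # depends on whether podman runs rootful or rootless
      - /run/podman/podman.sock:/var/run/docker.sock:ro

  blog:
    image: docker.io/library/wordpress:latest
    labels:
      - traefik.http.routers.blog.rule=Host(`blog.example.com`)
      - traefik.http.services.blog.loadbalancer.server.port=80
```

Adding a second service is then just another block with its own Host rule – no extra ports on the machine needed.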

See, as I was configuring Traefik, I realized that what a reverse proxy does is basically manage the HTTP request flow, and Traefik can inject services in the middle of that flow. These middle-man services are called middlewares, and one especially useful middleware is Authelia, which can add multi-factor authentication to any web service you put behind Traefik!
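As a hedged sketch, Authelia is typically wired in as a forward-auth middleware through labels: Traefik asks Authelia whether the request is authenticated before it reaches the protected service. The names and domains below are made up, and the verification endpoint path varies across Authelia versions, so treat this as an outline:

```yaml
services:
  authelia:
    image: docker.io/authelia/authelia:latest
    labels:
      # declare the middleware; the endpoint path depends on your Authelia version
      - traefik.http.middlewares.authelia.forwardauth.address=http://authelia:9091/api/verify?rd=https://auth.example.com
      - traefik.http.middlewares.authelia.forwardauth.trustForwardHeader=true

  blog:
    image: docker.io/library/wordpress:latest
    labels:
      - traefik.http.routers.blog.rule=Host(`blog.example.com`)
      # attach the middleware: unauthenticated requests get redirected to Authelia
      - traefik.http.routers.blog.middlewares=authelia
```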

Now that I had everything I wanted security-wise (for now), it was time to look for a Git server with LFS and Kanban boards, plus a CI/CD server. I chose Forgejo and Woodpecker, respectively.

All in all, this is a summary of the web services I’m self-hosting so far:

  • Traefik - reverse proxy for internal services
  • Forgejo - Git service that also includes Git LFS and Kanban boards – all I need to version-control my code and keep tabs on my software development progress
  • Woodpecker - CI/CD server that connects to Forgejo, which is where Woodpecker receives its events
  • Authelia - used as a middleware for an extra layer of security
  • inadyn - to keep my authoritative DNS provider up to date, since the IP address for my personal services keeps changing
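For reference, an inadyn configuration is just a small text file that looks roughly like this – the provider, credentials, and hostname below are placeholders for whatever your DNS provider expects:

```
# /etc/inadyn.conf -- placeholder values
period = 300                      # check the public IP every 5 minutes

provider default@dyndns.org {
    ssl      = true
    username = your-username
    password = your-token
    hostname = home.example.com
}
```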

And all of that runs on a Raspberry Pi 3 which, at the time I bought it, cost about 100 dollars in total.

On top of that I bought another machine, a mini PC with 8 GB of RAM, on which I host:

  • Gitlab-Runner - I’ll get into this one later
  • Woodpecker Agent - which is where the CI/CD tasks actually run
  • inadyn - to keep my authoritative DNS provider up to date, just so I can connect to this box later

Setting up the backup procedure

The backup procedure is straightforward. Since everything gets stored in podman volumes, all we need to back up is /var/lib/containers/storage/volumes.

I chose to do incremental encrypted backups using restic, which is amazing.
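As a sketch, the whole procedure boils down to a handful of restic commands (this assumes restic is installed and that the repository location and password are provided via the RESTIC_REPOSITORY and RESTIC_PASSWORD environment variables):

```shell
restic init                                        # one-time: create the encrypted repository
restic backup /var/lib/containers/storage/volumes  # later runs upload only what changed
restic snapshots                                   # list existing backups
restic forget --keep-daily 7 --keep-weekly 4 --prune  # optional: thin out old snapshots
```

Dropping the backup command into a systemd timer or cron job makes the whole thing hands-off.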

Handpicking a cloud storage service

It took me a while to figure out, and my choice of a lifetime plan for cloud storage is probably not a popular one, but I found pCloud to be a great service.

You can use rclone to mount the cloud filesystem locally or upload your files into pCloud.

It's important to note that their cloud filesystem does not store certain attributes such as the executable bit, but it's trivial to preserve those if you wrap your binaries in an archive format such as Zip or Tar.
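A quick way to convince yourself of this is to roll a binary through tar and check that the executable bit survives the round trip – the paths below are throwaway examples:

```shell
# create a tiny executable "binary", archive it, extract it elsewhere
mkdir -p /tmp/bitdemo/src /tmp/bitdemo/dst
printf '#!/bin/sh\necho hello\n' > /tmp/bitdemo/src/tool
chmod +x /tmp/bitdemo/src/tool
tar -C /tmp/bitdemo/src -czf /tmp/bitdemo/tool.tar.gz tool
tar -C /tmp/bitdemo/dst -xzf /tmp/bitdemo/tool.tar.gz
ls -l /tmp/bitdemo/dst/tool   # the x bit is preserved by the archive
```

Upload the .tar.gz to pCloud instead of the bare binary and the permissions come back intact on extraction.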

Choosing a suitable web host provider

This website is built with Hugo using the hugo-bootstrap-theme by Razon Yang.

Static websites (also known as JAMstack websites) are very easy to host and generally very secure, since there's no server-side processing other than serving the HTML, JavaScript, and CSS to the browser. They can still be vulnerable to cross-site scripting if not coded properly, but even then it's the web client that's affected, not the web server. Static sites are so easy to host that Gitlab provides Gitlab Pages at no additional cost! Github provides an equivalent service called Github Pages as well (but I chose Gitlab).

The comment system is powered by Staticman and is documented in the theme documentation. The contact form is powered by formailer. Both can be hosted on your favorite cloud function provider (e.g. AWS Lambda, Netlify, GitHub Apps, Google Cloud Functions, etc).

Remember the monthly bill of 4 cents I mentioned at the very beginning of this blog post? This is where it comes from for me. Since containers have to be created for the Staticman and formailer instances, mine live in Google Cloud Storage, and that's what Google bills me for. Speaking of which, I chose Google Cloud Functions because they process 2 million HTTP requests per month for free! You'll need to use this fork of Staticman to make it work as a Google Cloud Function (version 2). Staticman is also the reason why I have a Gitlab-Runner on one of my servers, since it runs a task to update this website every time someone posts a comment.

Finding a free Content Delivery Network (CDN)

Since Gitlab Pages hosting is served only from within the United States, loading this site from faraway countries might be slower than I'd like, so I spent a good while looking for a free Content Delivery Network, and Cloudflare was the only one I found that provides such a service for free.

Selecting a mail service

While there's Cloudflare Email Routing, which is free and only requires you to have purchased a domain name, I personally wanted to avoid using that.

I thought about self-hosting the email service as well, but from what I've read it turns out that's not ideal, mostly because staying off of blacklists can become a full-time effort. That is, your outbound email could be marked as spam if your email service isn't used constantly, or for some arbitrary reason or new policy of sorts, which is hard to control and keep an eye on.

If you plan to self-host, you could handle inbound email yourself but use a relay for outbound email to increase deliverability, for example – but at that point you'll be paying for an email service anyway.

I chose the possibly unpopular option of buying a lifetime plan from MXRoute, and so far I'm very satisfied with their service – you can set up as many email addresses and domains as you want with a single lifetime plan, which is amazing.

Conclusion

For brevity's sake I've skipped over many details: configuring Traefik; designing the network by setting up a DMZ and buying a managed switch to set up a type of VLAN (such as an MTU VLAN) to further tighten security; setting up the A, TXT, and MX records in your DNS provider to hook up the mail service; configuring Cloudflare's reverse proxy, which is needed to enable their CDN; and setting up off-site server monitoring, for which I use hetrixtools.com. I also skipped over the electricity calculations, but with the right hardware those can be as low as $15 a year or so.

Building your own infrastructure for your projects, and setting up a website as well, can be a massive endeavor, but it can be done for as little as $200 plus 4 cents a month (or a dollar or so more per month if you include the electricity bill) if you have the time. And in the end, if you plan to use it for the rest of your life (just like I am), it is worth it: you'll have the ability to slap on any other service you need, so long as there's Free and Open Source Software out there that meets your needs, and you won't have to depend on anyone else. For example, do you need video chat? Give Mirotalk a try! Want to provide remote IT support to your clients? Maybe Rustdesk is for you. See where I'm going with this? That's freedom right at your fingertips. Unless you shoot electricity out of yours (just like mine are doing right now).

For those wanting to start their own business, the possibilities of setting up an infrastructure for very cheap are endless. You just need to make sure you abide by each project's license and that you can use it commercially and for your purposes.

And if I haven’t said it already, I need to say that Free and Open Source Software is amazing!
