There are many reasons for wanting to self-host applications. As a user, you might not like seeing your data sold to the highest bidder and watching companies gather gigantic amounts of information about you. As a company, you might worry about what will happen if the company providing a service goes bankrupt, gets hacked, and so on.

The fact is, there are a lot of very decent alternatives to most of the applications you're probably already using. I'm writing this guide hoping it will help you get started, by sharing my experience, considering:

  • I happen to run my own self-hosted environment and the entire stack is running nicely.
  • There aren't many guides available, and some of those guides do it wrong (this one for example gives you wrong information about SSL)
  • Those guides usually don't tell you about the trade-offs involved in selecting a piece of tech. Disadvantages aren't proudly marketed anywhere and you might learn about them the hard way.

Our motto for this guide is to help you make the right decisions for your installation. Most of the time, there's a trade-off you have to choose from. Keep in mind there's always a cost to something that looks like a great idea, and we'll try to expose those costs to the best of my knowledge.

General Approach to self hosting

Depending on your level, you might consider two different paths:

  1. install an easy-to-use solution that aims to make self-hosting easier, such as:
  2. go with a fully custom solution that you build from scratch

If you're not willing to learn about Linux and server administration, you don't have a choice: you need to go with option 1. Option 1 gives you ease of installation at the cost of being limited in terms of which applications you can install and how they are set up. To make it happen, go to the getting started page of any of the websites listed above and you should be ready to go quickly.

Option 2 is what this post is all about. It's definitely harder, but I'll be with you along the way so that you can install and customize pretty much everything you need without being limited by an app store.

The Infrastructure

Server

It might sound obvious, but to self-host your own set of applications, you need a server. That's probably the first choice you'll have to make throughout your journey:

where do you host the server(s)?

You have to choose between two options:

  1. Host it at home: The only advantage is that you have direct access to your machine. A few things about this:
    • real servers make real noise. Even if you think it's cool because it feels pro, you don't want this at home unless you're living alone on an island. My advice is to keep it cheap and simple by using whichever hardware you already have; you'll know when you need more, and until then, relax and don't invest much.
    • don't think it's free: you'll still be paying for electricity, and that won't necessarily be cheaper than a cheap machine in the cloud
    • you'll have to set up a bunch of things:
      • configure your router so that traffic coming from the internet is handled by your server (that's what we call port forwarding)
      • ask your internet provider for a fixed IP (if that's not possible, you can look at dyndns, but keep in mind those solutions are flaky by design).
    • pay attention to your internet connection. Outside your LAN, if your connection isn't great, you might end up with slow, unresponsive applications. If you care about this, I would advise going with the rental option
  2. Rent it from a hosting company. They'll usually give you the choice of a VPS or a dedicated server with various support level agreements. Depending on the number of apps you want to install, start small and get more when you need more. There's no point in having a large machine if it stays idle all the time.

Tips: Weigh the pros and cons and make a decision.

Tips: If you go with option 2, I would recommend Digital Ocean. Everything has been cheap, nice and easy so far. On the plus side, you can also get 2 months of free credit on their cheapest machine if you create an account with this link. I've also used Vultr and EC2 without problems. If you are considering OVH, for your own sake, run away. I had been using them for 5 years without a hitch until recently, when:

  • I needed to use their support, which is unresponsive, takes forever to get anything done and automatically closes tickets even when the issue isn't solved
  • they refused to provide me with an invoice in plain English for a new contract
  • they decided to sue me on the basis that I needed to pay for another year of a service I didn't want to renew. If you decide to go ahead anyway, at least pay attention to all their newsletter bullshit, as you'll get into trouble if you don't (like me)

OS

Choosing an OS is like choosing between white, dark and milk chocolate. There's no absolute truth, it's only a matter of taste. Anything could do the job, but Linux will be our pick for this guide. Like with chocolate, your favorite milk chocolate could be Lindt or something else; in the Linux world, this is what we call a "distribution". In this guide, the instructions will be given for the Linux distribution called Ubuntu Server.

Tips: Pick any Linux distribution you're already familiar with. There's no absolute obligation here, but try to pick a server distribution over a desktop one, as there are a few differences that make it better suited for the task.

Tips: There's a wide range of offerings for container-based orchestration. You might see a lot of buzzwords like CoreOS, Kubernetes, Swarm, … I will ignore those offerings as:

  • I don't need scalability for self-hosting. Those tools are designed to solve problems I don't have.
  • Those techs are bleeding edge
  • I don't have deep knowledge of them beyond toy projects made over a weekend

Get yourself a domain name

Essentially, a domain name is a convenient way for your users to access a service (example: reddit.com). It's not a must-have but it's definitely recommended. The most popular provider is GoDaddy; I've used them a lot in the past and never had any problems.

Once you've got yourself a domain name, assuming you only have 1 machine and 1 IP address, I would recommend the following:

  1. Create an A record: yourdomain.com -> ip address of your server
  2. Create a CNAME record: www.yourdomain.com -> yourdomain.com
  3. Every time you want to create a new service, generate the corresponding configuration: a new CNAME record: applicationname.yourdomain.com -> yourdomain.com

The reason for doing this is to make your life easier. The day you change your server or something happens with your IP address, you'll simply have to update your A record and call it a day.
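
To check that the records behave the way you expect, you can query them from any machine with dig (part of the dnsutils package on Ubuntu). Replace yourdomain.com with your actual domain:

# the A record should print your server's IP
dig +short A yourdomain.com
# the CNAME records should print yourdomain.com
dig +short CNAME www.yourdomain.com
dig +short CNAME app1.yourdomain.com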

If you really don’t want to invest in a domain name, you have 2 options to access your services:

  • you'll have to type into your browser the IP address of your server followed by the port the application is running on (eg: http://24.34.12.234:1010). Good luck explaining this to a non-tech-savvy person over the phone. It won't be user friendly, it will be slower (no gzip compression) and less secure (no web application firewall).
  • you can trick your OS DNS resolution by editing your hosts file (/etc/hosts on Linux and macOS). Example: if you add "192.168.0.17 selfhosted.com" to your hosts file, typing selfhosted.com in your browser will bring you to 192.168.0.17. It's better than the approach above, but anybody who needs to access your applications will have to update their hosts file as well.

Tips: If you purchase a domain with this link, I will earn a few bucks which is a great way to show support if you like this post and want more in the future


Before getting deeper

Being able to connect to your server is an absolute requirement; for this, SSH is your friend. In this section, we'll initiate a connection to the server that will get you a prompt from which you can execute commands.

If you chose to go with:

  • a cloud provider: wait until you receive the credentials to your newly created machine. Those credentials are the ones you'll use to initiate an SSH connection to your server.
  • home hosting: plug a screen and a keyboard into your server and proceed with the installation of the operating system of your choosing. Once complete, you will need to install the SSH server by typing:
sudo apt-get update
sudo apt-get install openssh-server

To connect to your server, you will need something called an SSH client. If you are:

  • a Linux / macOS user: it is very likely the SSH client is already pre-installed. Fire up a terminal and type:
ssh username@host_ip
# replace username with your actual username, it might be 'root' by default
# replace host_ip with the IP address of your machine
  • a Windows user: give PuTTY a try, enter the IP address of your machine and connect

If everything is working fine, you should be welcomed with a message and a prompt you can type commands from. For example, you can type in:

whoami
date

Setting up our server correctly

Note on security

When I first got my hands on an actual server, I thought nobody would even try to attack it, as I had nothing of interest for anybody other than me. I was immensely wrong. If you feel the same, you need to understand that the internet is a place where nasty bots constantly try to attack servers in order to own them. There's nothing you can do except protect yourself. If you don't properly secure your machine, it's a matter of minutes before you get hacked, and your server will do one or both of the following:

  • be used to spam other people
  • participate in large attacks that will read in the news as: "the xxx company was hacked by xxx"

You might also be tempted to install many things on your server right away. Keep in mind that the more tools you have, the larger your attack surface. Be very conservative about which software you install on your server.

First things first

On your newly created machine, you will need to do a few things, namely:

  1. Upgrade everything that came pre-installed on your server:

    sudo apt-get update
    sudo apt-get upgrade
    
  2. If you're using the root account, keep in mind that using it directly is, for good reasons, considered very bad practice. Don't be a cowboy and make yourself another one:

    sudo adduser username             # replace 'username' with a name of your choosing
    sudo usermod -aG sudo username    # give the new account admin (sudo) rights
    sudo su username
    passwd
    
  3. Properly set up your SSH connection:

    • by blocking botnets from trying to breach your server using brute force attacks:
    sudo apt-get install fail2ban
    

    Fail2ban is a great tool that blocks suspicious behavior, such as somebody still attempting to connect to your server after 10 failed password attempts. I don't know any good reason not to use it.

    • disable root access to your machine:
    sudo sed -i 's/^#\?PermitRootLogin.*/PermitRootLogin no/' /etc/ssh/sshd_config
    sudo service ssh restart
    

    I don't know any good reason for someone to connect directly with the root account. If you need root, connect as a normal user, then become root by typing:

    su root
    
    • Change the default port SSH is using (22) to something else like 2222:
    sudo sed -i 's/^#\?Port 22/Port 2222/' /etc/ssh/sshd_config
    sudo service ssh restart
    

    This is not a must-have, simply a nice-to-have. Personally, I don't even do it every time.

    • Securely connect to your server without always typing your password. From your desktop:
    ssh-keygen -t rsa -b 4096      # generate a key pair on your desktop, accept the defaults
    ssh-copy-id username@host_ip   # copy the public key to the server (add -p 2222 if you changed the port)
    

From now on you have different choices:

  1. you stay with this basic setup and jump to the firewall configuration section below
  2. you want to increase security even more. You have different options available, and you can even stack them on top of each other:
    1. Disable login with password. You can connect to an SSH server using a private key. It's convenient, and you can also improve security by disabling password authentication so that only people with the private key can actually connect. Personally, I always use a private key as a convenience, so I don't have to type a password all the time, but I don't disable password login, as I wouldn't want to be locked out of my own machine, despite backups. If you feel confident doing so, move ahead and follow this tutorial (a minimal sketch also follows this list).
    2. Set up a VPN with OpenVPN. It will make things more secure, but it adds an extra step when you want to connect to your server: first connect to the VPN, then connect to your SSH server. I'll show you a bit later in this tutorial how to set up a VPN. This solution has the same problem as the one above, plus if your VPN service is down, well, good luck.
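
If you do decide to disable password authentication, it boils down to something like the following; a minimal sketch, to be run only once you have confirmed that key-based login works:

sudo sed -i 's/^#\?PasswordAuthentication.*/PasswordAuthentication no/' /etc/ssh/sshd_config
sudo service ssh restart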

Configuring your firewall

Ubuntu comes with iptables, the classic firewall most people in the Linux community are using. If your infrastructure consists of one machine, iptables will likely be your only protection. That's why it's important to get this right. If you have many machines on the same network, then you probably want to use a physical firewall, create a DMZ, set up honeypots and do a bunch of fun things we won't cover in this post.

Tips: Some nice tooling was built around iptables; ufw is one of them. In practice, you probably don't want to use it, as I've seen some software (eg: docker) that edits its own iptables rules, and those rules won't be visible through ufw.

Warning: Don't get locked out of your server while configuring your firewall. It has happened to me multiple times, and it's very easy to do if you don't double-check and understand everything we'll do here. To avoid this, I would recommend following this guide and making sure you understand the script before launching it.

We'll first create our script and store it under /tmp/firewall.sh. Copy and paste this large command:

cat > /tmp/firewall.sh <<EOF
#!/bin/bash

# Flush current rules
iptables -F
iptables -t nat -F
iptables -t mangle -F
iptables -X

# block everything by default
iptables -P INPUT DROP
iptables -P FORWARD DROP
iptables -P OUTPUT ACCEPT

# allow SSH: replace your SSH port if needed
# in
iptables -A INPUT -i eth0 -p tcp --dport 2222 -m state --state NEW,ESTABLISHED -j ACCEPT
iptables -A OUTPUT -o eth0 -p tcp --sport 2222 -m state --state ESTABLISHED -j ACCEPT
# out
iptables -A OUTPUT -o eth0 -p tcp --dport 2222 -m state --state NEW,ESTABLISHED -j ACCEPT
iptables -A INPUT -i eth0 -p tcp --sport 2222 -m state --state ESTABLISHED -j ACCEPT


# allow HTTP
iptables -A INPUT -i eth0 -p tcp --dport 80 -m state --state NEW,ESTABLISHED -j ACCEPT
iptables -A OUTPUT -o eth0 -p tcp --sport 80 -m state --state ESTABLISHED -j ACCEPT

# allow HTTPS
iptables -A INPUT -i eth0 -p tcp --dport 443 -m state --state NEW,ESTABLISHED -j ACCEPT
iptables -A OUTPUT -o eth0 -p tcp --sport 443 -m state --state ESTABLISHED -j ACCEPT


######
# PROTECTION TO AVOID GETTING KICKED OUT OF THE SERVER
# the idea is: we run this script and, if the firewall is badly configured, it will be disabled after 120 seconds so that you don't have to call support telling them: I lost access to my machine ....
# if everything is fine, quit the script before the protection kicks in
sleep 120
echo "- PROTECTION: RESET OUR FIREWALL CONFIG"
iptables -F
iptables -t nat -F
iptables -t mangle -F
iptables -X
iptables -P INPUT ACCEPT
iptables -P FORWARD ACCEPT
iptables -P OUTPUT ACCEPT
EOF

Read this script and understand it first, making the relevant changes that apply to your specific use case. If you haven't made any special customization to the configuration I gave you earlier, you should be good to go:

sudo bash /tmp/firewall.sh
# wait for a few seconds
# then Ctrl-C to quit the script. If you can still type things, you're all set.
# If your screen is frozen, it means you got locked out of your own server! Don't panic and wait 2 minutes. When you get your access back, edit the script with the relevant changes to your configuration and start again.
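
One thing to keep in mind: iptables rules loaded this way don't survive a reboot. Once you're happy with the result (and have removed the protection block at the end of the script), one simple way to persist the rules on Ubuntu is the iptables-persistent package; take this as a sketch rather than the only way to do it:

sudo apt-get install iptables-persistent
# save the currently loaded rules so they get restored at boot
sudo netfilter-persistent save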

Setup your apps

Achieve segregation between apps

Why would you want to separate all the apps from each other?

It’s mainly to solve 2 problems:

  1. The security problem. If an attacker breaks into your server because one app has a security issue, they gain access to everything.
  2. The version problem. An example would be 2 apps that need 2 different versions of NodeJS. Yes, you can fix this specific issue for NodeJS by introducing more tooling you will have forgotten about next week, but you will probably also encounter the same issue with PHP, Python, Ruby, …. At the end of the day, you will end up spending more time hitting your head against the wall trying to fix those problems while you probably have better things to do.

To make all your apps independent from each other, there's a wide range of tech available to you, but it can be grouped into 3 kinds of solutions:

  • 1 machine per application: basically, you're buying a new machine every time you need to add something new. Don't do that unless there is no other choice.
  • VM-based solutions: VMware, Proxmox, ….
  • Container-based solutions: docker, ….

Choosing between one or the other is a trade-off:

                        1 machine per app    VM-based solution    Container-based solution
Overhead in resources   very large           large                tiny
Security                best                 best                 ok
Ease of maintenance     hard                 hard                 easy
Ease of migration       hard                 hard                 easy

If you're looking for a cost-effective solution, containers are great, considering VM-based solutions will always consume more resources. On the other hand, VMs will be more secure by achieving real process isolation (your apps will run on a different kernel, whereas container-based solutions use the host kernel).

For the purpose of this guide, we'll focus on the container-based solution using docker, as my motivation was to be as cost-effective as possible and the value of the VM approach didn't outweigh its cost.

Install apps

How it works

Nothing better than a good old-school diagram to gain a better understanding of what we're building here:

How it works: When a user attempts to visit app1.domain.com, the request first hits the reverse proxy. The reverse proxy's role is to forward the request to the container running the service associated with app1.domain.com and to send the container's response back to the user's browser.

A few things here:

  1. Reverse proxy: we will use a piece of software called nginx for this. It will be installed on the host, listening on ports 80 and 443 (HTTP and HTTPS). When the reverse proxy receives a request for a domain it knows, it forwards it to the proper container, in the same way an air traffic controller manages incoming flights. Our services will only be accessible through the reverse proxy, which adds security and improves loading speed.
  2. Containers: they are the building blocks running our applications. They will expose their service through the loopback IP (aka localhost) on a certain port so that nobody can access them directly from the internet.

Install: Issue the following commands:

sudo apt-get update
sudo apt-get install nginx docker.io docker-compose
# note: on Ubuntu the docker engine package is called docker.io, not docker
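
As a quick sanity check that all three are functional (hello-world is a tiny test image published by the docker project):

nginx -v
docker-compose --version
sudo docker run --rm hello-world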

Skeleton of an app

I wanted to share here what most of the apps you might be interested in installing are made of:

  1. The code of the app is the only mandatory piece; it is what makes it unique. Essentially, you might need to compile the source, install dependencies for the code to run, and install some other software needed to run that code. It all depends on how the application was developed in the first place. For example, if the app is made with:
    • golang: the application runs as a fat binary, which means the binary contains all the libraries it needs to run on your environment without you having to tweak anything
    • NodeJS: the application is run by calling some command that starts with either npm or node, after you've installed all the dependencies
    • Java: the application is run either by calling the java command on the jar representing your application or through some sort of Tomcat server
    • PHP: the application is run by either Apache or Nginx with a special module able to interpret the code and execute it
    • ….
  2. Databases are third-party systems that are very frequently used by the core application to store/retrieve information. They aren't mandatory, but most applications require the use of a database. Examples of databases you are very likely to encounter: Postgres, MySQL, SQLite, MongoDB
  3. Less frequently, you might see some other systems you have to install to make your application work:
    • some sort of messaging queue: RabbitMQ, Kafka, …
    • some sort of search component: Solr, Elasticsearch

You can see applications made of any combination of these, but the most frequent are:

  1. 1 and 2
  2. 1
  3. 1 and 2 and 3

Install an application

Now that we have all the tooling in place, let's go through the process of installing an app on our server. As an example, we will start with 2 apps:

  • Nuage: a web-based client to manage the files on your server. As shown in the section above, nuage only needs 1 to work
  • Lychee: a web-based tool to store / view your pictures. As shown in the section above, lychee needs 1 and 2 to work

We will do a few things here:

  1. Configuring our reverse proxy to serve our applications on the internet. Let's say they will run on the domains nuage.domain.com and photo.domain.com
  2. Build the application container that will run the code for our apps

  3. Configuring our reverse proxy

    Nginx has several directories you are interested in:

    • /etc/nginx/sites-available/: contains one configuration file per application you could be running
    • /etc/nginx/sites-enabled/: contains (symlinks to) the configurations of the applications that are currently enabled

    Each application will have its own configuration file on the reverse proxy:

    # note the quoted 'EOF': it stops the shell from expanding the $variables below
    sudo tee /etc/nginx/sites-available/files.conf > /dev/null <<'EOF'
    server {
        listen 80;
        server_name files.domain.com;
        client_max_body_size 1024M;
    
        location / {
            proxy_set_header        Host $host:$server_port;
            proxy_set_header        X-Real-IP $remote_addr;
            proxy_set_header        X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header        X-Forwarded-Proto $scheme;
            add_header              Strict-Transport-Security "max-age=63072000; includeSubdomains; ";
    
            proxy_pass          http://127.0.0.1:10000;
            proxy_read_timeout  90;
    
            gzip on;
            gzip_comp_level 6;
            gzip_vary on;
            gzip_min_length  1000;
            gzip_proxied any;
            gzip_types text/plain text/css application/json application/x-javascript text/xml application/xml application/xml+rss text/javascript;
            gzip_buffers 16 8k;
        }
    }
    EOF
    
    sudo ln -s /etc/nginx/sites-available/files.conf /etc/nginx/sites-enabled/files.conf
    sudo nginx -t && sudo service nginx reload   # check the config and apply it
    
  4. Enable https on your sites

    Plain HTTP usage is decreasing over time as more and more websites use HTTPS. Self-signed certificates are fine for development, but if you do it wrong, chances are you'll be the victim of a man-in-the-middle attack. When I say doing it wrong, I mean making either of these mistakes:

    1. not letting your OS know about your self-signed certificate
    2. not verifying your actual SSL certificate before clicking on "I understand the risk and I want to pursue the navigation anyway".

    If you don't want to spend money on SSL certificates, you can use Let's Encrypt, a free provider of SSL certificates. As Let's Encrypt is free, I can't advise using it for something you make money from, as it doesn't come with proper support. For example, the other day their server was down and I couldn't complete the setup of an app; I had to wait until they were back online.

    Installation: To generate SSL certificates, we will need to install a tool called certbot:

    sudo add-apt-repository ppa:certbot/certbot
    sudo apt-get update
    sudo apt-get install python-certbot-nginx
    

    To make sure certbot was installed correctly, type the command:

    certbot --version
    

    You should be greeted with something like certbot 0.14.2. If you get bash: certbot: command not found, then something went wrong; dig into the install documentation.

    Create a certificate: Once certbot is installed, generating an SSL certificate is simple:

    1. Make sure you have created a new subdomain pointing to your server. Usually a CNAME record that looks like this: "appx.domain.com -> domain.com"
    2. Type the following command:

      sudo certbot --nginx -d domain.com -d app1.domain.com -d app2.domain.com -d app3.domain.com
      

      If you need to set up a new application, just run the same command again, appending the new domain at the end. This should create a bunch of files under /etc/letsencrypt/, but we're only interested in 2 of them:

      • the newly generated certificate: /etc/letsencrypt/live/domain.com/fullchain.pem
      • the newly generated private key: /etc/letsencrypt/live/domain.com/privkey.pem

      The nginx configuration for our application will be something like this:

      server {
         listen 80;
         server_name app1.domain.com;
         return 301 https://$server_name$request_uri;
      }
      server {
         listen 443 ssl;
         server_name app1.domain.com;
         ssl_certificate /etc/letsencrypt/live/domain.com/fullchain.pem;
         ssl_certificate_key /etc/letsencrypt/live/domain.com/privkey.pem;
         ....
      }
      

    Renewing a certificate: Let's Encrypt certificates are only valid for 3 months, so you will have to renew them regularly by executing the following command:

    sudo certbot renew
    

    In practice, I have my root user execute the command from a cron job. In other words, everything is done automatically without requiring any human intervention. To do this:

    sudo su # type your password
    crontab -e # should open a file you can edit
    

    At the end of the crontab, add the following:

    # renew ssl certificates (at 3am on the 1st of every month)
    0 3 1 * * certbot renew --quiet
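
    Before relying on the cron job, it's worth checking that the renewal actually goes through; certbot can simulate the whole process without touching your certificates:

    sudo certbot renew --dry-run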
    
  5. Build the application container

    Setting up your container can be more or less easy. The best-case scenario is when a nice container image and compose file are already maintained directly by the project.

    The install instructions are given here. Like most of the applications you'll probably want to install, lychee is made of some code (PHP in this case) which connects to a database to store and retrieve its data.

    We will use a tool called docker-compose to manage our application, which will make things much easier for us. At its core, our application will be configured with a docker-compose.yml file, which describes what the application is made of.

    To do that, we must first create our docker-compose.yml file:

    mkdir data
    cat > docker-compose.yml <<EOF
    version: '2'
    services:
      db:
        container_name: lychee_db
        image: mysql
        restart: always
        environment:
          MYSQL_DATABASE: lychee
          MYSQL_PASSWORD: TnFp4k6J5y9APu6C
          MYSQL_ROOT_PASSWORD: TnFp4k6J5y9APu6C
          MYSQL_USER: lychee
        volumes:
        - ./data/db:/var/lib/mysql
    
      lychee:
        container_name: lychee_app
        build: ./img/
        restart: always
        ports:
        - "127.0.0.1:10000:80"
        volumes:
        - ./data/code:/var/www/
        depends_on:
        - db
    EOF
    

    Our application is made of 2 services:

    • db: the database lychee will use to store/retrieve data. It uses mysql, and as there is an official docker image for mysql, we will use it. The image documentation specifies a few things we can configure, such as the username, password and database name. We want the container to be restarted if the mysql process fails, and we want the entire state of mysql (/var/lib/mysql) to live on the host. That way we can back up and recover its state without losing data.
    • lychee: the container that runs the lychee code. It is built from a custom image that runs an Apache server with a few extensions needed by lychee, and it is exposed to the host on port 10000.

    Creating our custom image: The goal here is to create a docker image we can use to launch our application. As a starting point, we create a Dockerfile skeleton:

    cat > Dockerfile <<EOF
    FROM ubuntu:latest
    MAINTAINER mickael@kerjean.me
    
    RUN mkdir data
    
    EXPOSE 80
    VOLUME ["/data"]
    WORKDIR "/data"
    
    EOF
    

    We will fill in this skeleton as we progress in building the environment to run our application:

    sudo docker run -ti ubuntu bash
    

    Once we're inside a container, we can fool around and determine the packages we need to run our application correctly. After some trial and error, I ended up with:

    mkdir img
    cat > img/Dockerfile <<EOF
    FROM ubuntu:16.04
    MAINTAINER mickael@kerjean.me
    
    RUN apt-get -y update && \
        apt-get install -y git && \
        #####################
        # INSTALL SYSTEM DEPS
        apt-get install -y apache2 php php-mysql libapache2-mod-php imagemagick && \
        #####################
        # INSTALL APPLICATION
        cd /var/www && \
        git clone https://github.com/electerious/Lychee && \
        chown -R www-data:www-data Lychee && \
        #####################
        # CONFIGURATION
        sed -i 's/DocumentRoot \/var\/www\/html/DocumentRoot \/var\/www\/Lychee/' /etc/apache2/sites-enabled/000-default.conf
    
    RUN apt-get install -y php-imagick php-common php-json php-curl php-cli php-gd php-mcrypt
         # CLEANUP
    #    apt-get -y clean && \
    #    rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
    
    RUN sed -i 's/max_execution_time.*/max_execution_time = 3600/' /etc/php/7.0/apache2/php.ini
    VOLUME ["/var/www/Lychee/data", "/var/www/Lychee/uploads"]
    ADD entrypoint.sh /
    CMD ["bash", "/entrypoint.sh"]
    
    WORKDIR "/var/www"
    EXPOSE 80
    EOF
    

    This is where you will likely spend most of your time when containerizing an application: figuring out every piece that needs to be installed to make things work. It's a lot of trial and error until you end up with the right image.

    Once that's done, we will be able to start and stop our application easily with:

    # start an app
    docker-compose up -d
    # stop an app
    docker-compose down
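
    A few other docker-compose commands come in handy while iterating on a new app:

    docker-compose ps                  # list the containers of this app and their state
    docker-compose logs -f             # follow the logs of all services (Ctrl-C to quit)
    docker-compose build --no-cache    # force a full rebuild of the custom image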
    

    Open up a browser and navigate to http://app1.domain.com; you should be greeted with the lychee homepage.


    You might think this is a lot of work, and you would be right. To make things a bit easier, I've shared my recipes for the apps I'm using in the next sections, so everything should be easier for you than it was for me.

  6. Common pitfalls using docker

    There are a lot of docker-compose files available on the internet. Here is a checklist to run through before deploying another service:

    1. Make sure the port section of your container redirects to the loopback IP (see the quick check after this list). By default, docker opens a hole in your firewall, which would circumvent your nginx proxy and thus your web application firewall.
    2. Make sure the application state is stored on the host filesystem (database, custom configuration, files, …). In practice, if you're using mysql you need to create a volume for /var/lib/mysql (or /var/lib/postgresql/data for postgres). I see a lot of people trying different hacks to back up a database, but really, the easiest thing you can do is back up the filesystem, and it will work as long as you stop your container before doing the backup.
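
    To make the first point concrete, here is a quick check you can run on the host: no published port should be bound to 0.0.0.0, only to 127.0.0.1:

    sudo docker ps --format '{{.Names}} -> {{.Ports}}' | grep '0.0.0.0' \
      && echo "WARNING: a container is exposed directly to the internet"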

Painlessly install something

I've created a GitHub repo to make the process as easy as possible (feel free to contribute). Basically, you can clone this repo:

cd /tmp
git clone TODO link
cd selfhosted

We've placed a Makefile at the root of the repo. We'll use it to automate everything: installing, starting, stopping and backing up our applications.

Let’s say you want Mattermost (alternative to Slack):

echo "mattermost" | make install

To install Lychee (store all your pictures/create album):

mkdir /app && cp -R /tmp/selfhosted/lychee /app && cd /app
make lychee_install
make lychee_start

To install Owncloud (alternative to dropbox):

mkdir /app && cp -R /tmp/selfhosted/owncloud /app && cd /app
make owncloud_install
make owncloud_start

Maintenance

Backup

Approach

A golden rule for backing things up: try to recover from a backup before an actual problem occurs. You don't want to end up with backups that can't be restored for some reason.

Looking at the tooling, there's a wide range of solutions available: rsync, rclone, …. The difference between them isn't big, and they all fill a specific niche some other tool doesn't fit. At the end of the day, the result will be the same: your data will be somewhere safe if something goes south. Those tools usually provide you with different strategies for backing up data:

  • differential backup
  • incremental backup

If you don't know about those strategies, go take a look here.

Dead simple approach without tooling

I personally have an FTP server available for backups, as it came for free with my dedicated server. That's the cheap backup option, but it works for my needs.

Trying to get it right, I ended up spending way too much time on the tooling. In the end, I didn't manage to restore all my applications correctly, as I ran into some permission issues that couldn't be set properly on the backup server. I finally went with a full backup approach, creating compressed archives (tar.gz) of all my data and piping them to the FTP server.

Concretely, my backups are run from the root user with a crontab that looks like this:

crontab -l
# Edit this file to introduce tasks to be run by cron.
#
# Each task to run has to be defined through a single line
# indicating with different fields when the task will be run
# and what command to run for the task
#
# To define the time you can provide concrete values for
# minute (m), hour (h), day of month (dom), month (mon),
# and day of week (dow) or use '*' in these fields (for 'any').#
# Notice that tasks will be started based on the cron's system
# daemon's notion of time and timezones.
#
# Output of the crontab jobs (including errors) is sent through
# email to the user the crontab file belongs to (unless redirected).
#
# For example, you can run a backup of all your user accounts
# at 5 a.m every week with:
# 0 5 * * 1 tar -zcf /var/backups/home.tgz /home/
#
# For more information see the manual pages of crontab(5) and cron(8)
#
# m h  dom mon dow   command
0 0 * * * cd /app && PERIOD=daily make backup
0 0 * * 0 cd /app && PERIOD=weekly make backup
0 0 15 * * cd /app && PERIOD=monthly make backup
0 0 1 */3 * cd /app && PERIOD=quarterly make backup
0 0 * * 0 cd /app && make upgrade

The Makefile has many lines but is constructed this way:

backup:
        make mattermost_backup
        ....
mattermost_start:
        cd mattermost && docker-compose up -d
mattermost_stop:
        cd mattermost && docker-compose down || true
mattermost_backup:
        tar -zcf - mattermost | ncftpput -u$(FTP_USERNAME) -p$(FTP_PASSWORD) -c $(FTP_HOSTNAME) mattermost_${PERIOD}.tar.gz

Restoration

I can't say it enough: test your restores before it's too late.
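
As an illustration, with the FTP approach described above, a restore boils down to pulling the archive back and unpacking it before starting the app again. This is only a sketch, reusing the same FTP variables as in the Makefile:

# fetch the archive from the FTP server and unpack it into /app
ncftpget -u$FTP_USERNAME -p$FTP_PASSWORD -c $FTP_HOSTNAME mattermost_daily.tar.gz | tar -zxf - -C /app
cd /app/mattermost && docker-compose up -d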

Security upgrade

Once in a while, some large security issues pop up (Heartbleed, CVE-2017-5638, …), and if you don't take security seriously, it won't be long until your server is owned by somebody else. To avoid this, you need to regularly log in to your server and do the following:

  1. upgrade packages on the host machine:

    sudo apt-get update
    sudo apt-get upgrade
    
  2. upgrade the packages inside your containers. Basically, in my Makefile:

    upgrade:
          make mattermost_upgrade
    mattermost_upgrade:
          cd mattermost && docker-compose build --pull && docker-compose up -d
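
    If you'd rather not rely on remembering to log in, Ubuntu can also apply security updates on the host automatically through the unattended-upgrades package; a minimal way to enable it:

    sudo apt-get install unattended-upgrades
    sudo dpkg-reconfigure -plow unattended-upgrades   # answer 'Yes' to enable automatic security updates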
    

Improvements

Application Performance

Improving loading speed can be done in different ways:

  1. Optimise the performance of the database: you can configure and tune a lot of parameters in your database. Performance is a function of the available hardware, and the default configuration is usually set for low resource consumption & poor performance. If you tinker with your database configuration, you'll likely get big improvements. These links should help you get started:
    1. Postgres: http://pgtune.leopard.in.ua/, https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server
    2. Mysql: https://www.percona.com/blog/2014/01/28/10-mysql-performance-tuning-settings-after-installation/
  2. Use and abuse gzip compression: because loading speed matters to browser vendors, it's almost certain your browser supports decompressing content on the fly to minimise the number of bytes sent across the wire and thus improve loading time. Use and abuse this unless you have good reasons not to. You can do that by tweaking the nginx configuration for the site:

    server {
      ...
      location / {
        gzip on;
        gzip_comp_level 9;
        gzip_vary on;
        gzip_min_length  1000;
        gzip_proxied any;
        gzip_types text/plain text/css application/json application/javascript;
        gzip_buffers 16 8k;
        ...
      }
    }
    
  3. Use a CDN: if the application you're trying to install is a heavyweight like WordPress, using a CDN will likely improve the overall performance. Basically, pre-rendered pages will be stored on a network of machines and served to your users from there, giving almost no work to your server. Akamai is the big enterprise name in this space, but you can use Cloudflare as well. I wouldn't advise going with Cloudflare as:
    1. They are serving too much of the web already (10%) and they have a track record of not being neutral
    2. To access some Microsoft products I'm forced to use at work, I need to change the User-Agent in my browser (Qutebrowser). When I do that, they deny me access to every website using their service.
  4. Optimise the code: find your bottleneck and optimise the existing code. Contributing the fix back to the original repo is usually very appreciated, as you'll make things better for every user.

Monitor your services

It's not a must-have, but it's nice to see which service is down and proactively fix things before somebody figures out you're offline. There are many solutions available, but we'll discuss the one I have in place, which uses Jenkins.

Jenkins wasn't designed with this use case in mind, but it does a great job at it. Here's an example of my health check dashboard:

Every line gives the status of a service, so I get to see what's working and what's not. To do the same, you'll have to:

  1. Install Jenkins - see the github repo for this guide
  2. Configure Jenkins (you might be interested in the following plugins: Hudson Post Build Task, and Simple Theme with this CSS for a better look and feel)
  3. Create a new job named after the service you wish to monitor
  4. Configure the job:
    1. under build triggers, click on build periodically. I personally trigger the job every 2 minutes: "H/2 * * * *"
    2. under build command: "curl -L -X GET https://gitlab.com/users/sign_in | grep -q 'About GitLab'"
    3. configure the "Hudson post build task" plugin this way:
      1. log text field: "build as failure"
      2. script field: "curl -L -X GET https://gitlab.com/users/sign_in"

It takes no more than 5 minutes to add another service. Once you've done it a few times, you will end up with the same dashboard as mine, and it's pretty cool to have a health check of everything you've got available in one place.

Another point about backup

Another thing you might want to consider is not being vulnerable to ransomware (WannaCry, …). If your server can access its backups, anyone who gets access to your server can trash those backups.

You might want to do it the other way around, where the server never has any access to the backups.

The idea here is to have the backup machine (eg: a Raspberry Pi) log into your server, fetch the backups and store them in a safe place. This way, you can always recover, no matter what.
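
As an illustration, the pull could be as simple as a cron job on the backup machine; the host, port and paths below are hypothetical, the important part being that the credentials only live on the backup machine:

# on the backup machine's crontab: pull the archives every day at 4am
0 4 * * * rsync -az -e "ssh -p 2222" backup@yourdomain.com:/var/backups/ /mnt/backups/server/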

HTTPS

Here are 2 tips we can apply here:

  • using HSTS to improve security
  • SSL all the way down to your container

HSTS is a very cheap/simple security measure protecting your users against man-in-the-middle attacks (eg "New Tricks For Defeating SSL In Practice"). Put simply, you send the following header along with your responses:

Strict-Transport-Security: max-age=31536000

This tells the browser that your site must always be reached over HTTPS: even if someone tries to load something over plain HTTP, the browser will upgrade the request. In our case, we will put this in our nginx configuration:

server {
  ...
  location / {
    add_header Strict-Transport-Security max-age=31536000;
    ...
  }
}
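
Once nginx has been reloaded, you can check that the header is actually being sent (replace the domain with one of yours):

curl -sI https://app1.domain.com | grep -i strict-transport-security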

SSL all the way down to your container: as you may have noticed, the connection is only encrypted until it reaches the reverse proxy; everything after that (proxy -> container) is unencrypted, which is a risk you might want to control. To be honest, I don't mind this risk, so I haven't spent time on a solution that's easy to maintain; you're on your own here if you want to do this.