Low power consumption home backup server project
Posted by Médéric Ribreux 🗓 In projects/
Introduction
For many years, I lived without backups. That was really bad and I needed to overhaul the situation: in this digital era, having no backup puts your valuables at risk.
Over time, I finally managed to build a backup system that fulfills my needs, and here is how I did it…
What did I want to achieve?
First of all, let me describe what a backup system should be for me:
- It should definitely be easy to use, because if it is too complicated, you will not use it!
- It should be robust: based upon solid and proven software.
- It should be free software, period.
- It should support encryption: my whole private life will be in those backups, and there is no way I will leave it readable in the clear.
- It should be fast and efficient: if I have to wait more than two hours just to make a weekly backup, I will end up not using it!
- It should be able to back up all of my digital assets.
- It should consume as little electricity as possible.
- It should be affordable to buy and cheap to run.
What needs to be backed up?
Almost everything that is precious to me. Here is a rough list:
- Digital photos.
- Blog and `/var/www` stuff.
- Code.
- Configuration of computers (`/etc/` stuff).
- Emails.
- Movie collection.
- Music collection.
- Document archives (scanned PDF).
- Ebooks.
For the moment this represents a maximum of 2TB. I have capped my backup system to a maximum of 3TB.
Backup policy
Before trying to back up anything, I tried to define a policy: when will I do the backups and how long will I keep the data?
To make the best choice, here are the facts:
- I mainly work on the LAN during the weekend (on other days, I am at the office).
- Files change a lot more during the weekend than during the week.
- One weekly backup should be sufficient to keep potential work loss to a minimum.
- Some files will not change at all for long periods of time.
- Each year I review my projects and my digital documents, so I need to be able to grab a year-old backup.
To answer this I have the following backup policy:
- One backup per week (weekly).
- Store 3 weekly backups.
- Store 5 monthly backups.
- Store 1 yearly backup.
And what about making some backups in the cloud?
As of the third quarter of 2014, I believe that cloud-based backups cannot compete with the solution I have built. Just consider the backup duration: my ADSL maximum upload rate is about 100kB/s, so uploading 3TB would take more than a year! Restoration would also take a long time: a little more than 2 months (at 550kB/s)! That's way too much…
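For the sceptics, here is the back-of-the-envelope arithmetic behind those numbers (a quick shell check, counting 3TB as 3×10¹² bytes):

```bash
# Days needed to transfer 3 TB at my ADSL rates:
echo $((3000000000000 / 100000 / 86400))   # upload at 100 kB/s  -> 347 days
echo $((3000000000000 / 550000 / 86400))   # restore at 550 kB/s -> 63 days
```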
Furthermore, you have to consider the prices. Here is a little summary:
| Topic | Initial Costs | Cost for 3 years | Cost for 5 years |
|---|---|---|---|
| Hardware | 10 € (old stuff) | 10 € | 10 € |
| Hard disks | 300 € (3x3TB) | 300 € | 300 € |
| Power | 0.5 € (initial backup @ 70MB/s) | 9.5 € (3h a week = 16kWh/year) | 15.5 € |
| Total owned solution | 310.5 € | 319.5 € | 325.5 € |
| DropBox (99 €/1TB/year) | 300 € | 900 € | 1500 € |
| Amazon Glacier (396 €/3TB/year) | 393 € | 1188 € | 1980 € |
| Google Drive (120 €/1TB/year) | 360 € | 1080 € | 1800 € |
| Microsoft OneDrive (252 €/3TB/year) | 252 € | 756 € | 1260 € |
| OVH Hubic (120 €/10TB/year) | 120 € | 360 € | 600 € |
As a conclusion, I decided to go with my own backup solution, "clouded" at home!
Risk analysis
What risks am I facing on this backup topic? I have made a really light risk analysis:
| Risk | Probability | Impact | Solution/Workaround |
|---|---|---|---|
| Loss of data on a workstation, made by mistake | High | High | The backup server itself. |
| Hard to make a backup | High | High | Use efficient backup software. Test backup/restoration from the client point of view. Make a well-designed and documented backup script. |
| Scarce disk space | High | High | Use deduplicating backup software. |
| Whole backup server stolen | Low | High | External backup disk + encryption. |
| Backup server data hard drive stolen | Low | Medium | External backup disk (6 months of data loss). |
| Backup server system hard drive failure | Low | Low | Have a dedicated DRP for the operating system. |
| Backup server data hard drive failure | Low | High | External backup disk (6 months of data loss). |
| External data hard drive failure | Low | Low | Buy another disk and make another external backup. |
| Data theft | Low | High | Disk encryption. |
| Too much power consumption | High | High | Start the server only when requested; shut it down at the end of the backup. |
As the analysis tends to prove, I need an external backup disk and deduplicating backup software.
Hardware
Introduction
As I had decided to build my own backup solution, I chose to build a dedicated server for it. I could have added a new disk to my main workstation but, as I needed more stability, I could not use that system: my main workstation runs Debian testing, not stable. Furthermore, a dedicated disk in my workstation would consume power (even when idle) the whole time the workstation is up, and I only do backups once per week!
I could have bought a true NAS. QNap makes very low-power servers for the home, but they are quite expensive. Furthermore, spinning disks consume up to 10W each, even when idle… Combine that with the fact that a truly secure installation needs at least a RAID1 array of disks, and you get more than 20W just for making some backups. Consider that 20W over a year costs about 20€. With an entry price of about 250€ for a two-bay QNap NAS, this is quite expensive (from my point of view).
So I went with an old computer that has 4 SATA ports and a chassis able to hold at least 3 disks.
The backup server will only be powered when a backup is due.
Let me introduce you to Goofy
Here is Goofy, my main backup server. I have built the disk enclosures using scrapped CD-ROM and DVD-ROM drives.
Hard disks
The RAID1 array is managed as software RAID by the operating system (mdadm). I have not used LVM because each disk is formatted to its maximum capacity, which is much simpler for an occasional sysadmin.
I have used those disks for the RAID1:
- Hitachi Deskstar 5300 3TB SATA 6Gb/s
- Seagate Barracuda 7200.14 SATA 6Gb/s 3 TB
The system disk is independent from the datastore RAID1 array. It holds a standard ext4 filesystem with no encryption.
For the external disk I have used a Western Digital Caviar Green 3 TB SATA 6Gb/s in a USB2 enclosure.
Each time the backups need off-site vaulting, the disk is plugged into the backup server and a command copies all the data from the RAID array to the external disk.
System components
After many tests and years of use, I have based my backup solution on the following software:
- Debian Stable as the Operating System.
- Borg Backup as the main backup software.
- Borgmatic as the backup "configuration" software.
- OpenSSH for communication between clients and backup server.
For the encryption, I ran some tests comparing the native encryption mechanism of Borg with cryptsetup (LUKS). I finally chose Borg encryption because it was nearly as fast as cryptsetup and much simpler to handle. Furthermore, with Borg encryption, backup clients don't need to share the same encryption key!
Server system installation
Introduction
To build your backup server, you will need to follow these steps:
- Install Debian stable on the system disk.
- Add required packages.
- Configure software RAID1.
- Create a dedicated account for backup operations.
- Create an SSH key for this account.
- Build a backup script.
I will not describe the Debian installation; the Internet is full of tutorials for it.
Packages
```
# apt install borgbackup mdadm python3-pip rsync
# pip3 install borgmatic
```
Dedicated user
We need to create a dedicated account for backup writes. We will create a user named `borg`, and only this user will be able to write to `/media/backup/borg`. This account should not be allowed to log in with a direct login method (PAM); it should only be able to log in over SSH with the authorized keys mechanism.
Here is how we create such an account:
```
adduser --disabled-password --gecos "Dedicated Backup user" borg
```
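If you want to make the key-only policy explicit on the SSH side as well, a minimal `sshd_config` sketch could look like this (my own addition, not strictly required since the account has no password):

```
# /etc/ssh/sshd_config -- optional hardening: force public key authentication for borg
Match User borg
    PasswordAuthentication no
    AuthenticationMethods publickey
```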
Once the account is created, we want to disable the bash command history, because we will type the LUKS passphrase in the shell before mounting the external disk. So we need to trash everything in bash's command history. The easiest way to do it is simply to redirect `~/.bash_history` to `/dev/null`:

```
# su borg
$ rm ~/.bash_history
$ ln -s /dev/null ~/.bash_history
```
Notice that the `history` command will still work, because bash remembers the commands in memory as long as the user's session is open.
We also need to create a dedicated log directory:

```
# mkdir /var/log/backup
# chown borg:borg /var/log/backup
```
For SSH keys, well, use ssh-keygen as usual!
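For example (a sketch: the key type and empty passphrase are my own choices, adapt them to your taste):

```
# su borg
$ ssh-keygen -t ed25519 -N "" -f ~/.ssh/id_ed25519   # ed25519 and no passphrase are my choices
```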
Configure software RAID1
We need to build a RAID1 array with the two data disks (2x3TB). The RAID will be managed by software (we are not using hardware RAID) and the reference tool for this is `mdadm`.
`mdadm` can use raw disk devices or partitions. You could think that using the whole disk is better because it is easier: you don't have to partition the disks. However, partitioning is very interesting in at least one use case.
Imagine you run with two 3TB disks for 5 years. One day, one disk crashes and you must replace it. In five years, disk capacities will have grown and you probably won't be able to buy a 3TB disk anymore (just because they will be discontinued, or simply less attractive in €/GB terms). So you order a 6TB disk. For mdadm to rebuild the RAID array, you will be forced to partition the new disk with at least one 3TB partition. So I think it is better to start with partitioning, as you are nearly sure you will have to do it that way in the future.
But partitioning means we have to make some calculations beforehand to respect disk alignment. The problem is that the partition table size is not a multiple of the size of one block, so partitions can end up misaligned with the physical block size of the disk. It means that when you write something to the disk, you have to make two I/O operations: one on a first block and one on another block (which will not be full). For large files, this is not really a problem. But with files a little smaller than one block, two operations are done instead of one and I/O performance drops. Borg, the backup software we are going to use on this server, stores many small chunk files (blocks of deduplicated data), so it could run into this I/O performance problem.
So we need to align our partitions correctly before using the disks in the RAID array…
Why no LVM?
LVM is quite interesting when you want to deal with a volume aggregated over more than one disk. It is also easily resizable. For a few minutes, I wondered whether I should use LVM or not. Then I went through the use cases.
First of all, every disk will be used at its maximum size. I use this configuration because it is generally simpler to install and to manage: there is only one partition, filling the disk. There is no need for resizing; if you hit the filesystem size limit, you just change your disk. Period.
Then, you have to consider that LVM is not interesting in this configuration because it adds unnecessary complexity: a whole extra layer just to better manage partitions (which is what LVM is). Instead of dealing with mdadm+ext4, I would have to face mdadm+LVM+ext4. Furthermore, when you resize an LVM logical volume, you have to resize the filesystem as well.
One point that could change everything is the ability of LVM to make snapshots. But the only time we need such a feature is when externalizing backups to the third disk. Snapshots could also be a nice feature for the clients, but they are unfortunately not set up with LVM, so we must do without it on this side of the backup.
Why not use LVM on the system disk? For the same reasons: the system disk will not change much once the system is installed. With proper log rotation, the disk will never fill up (even with logs). There will be system updates, but their size will never exceed the disk size. The typical installation takes up to 4GB and the system disk is about 20 times that size (80GB). Furthermore, I have a dedicated system that generates installation images, to easily recreate a new system from scratch in a fairly automated way. No need to bother with a fine-grained disk partition manager.
So we will not use LVM anywhere on the backup system.
Disk partitioning and alignment operations
For the disks I have bought, here is the physical size of one block:

```
# cat /sys/block/sda/queue/physical_block_size
4096
```

So we have a value of 4096 bytes, which is nowadays the standard in the disk industry.
We are going to use GNU Parted to align our partitions. The first (physical) sector of the first partition will start at an offset of 1MB. If you do the calculation, 1MB/4096B = (1024*1024)/4096 = 256, which is an integer. So our first partition will be aligned on the 256th block of 4096 bytes.
Why 1MB? Because that is how GNU Parted likes its GPT partitions aligned!
Here is the creation of our partitions:
```
# parted --align optimal /dev/sdX
(parted) mklabel gpt
(parted) mkpart primary 0% 100%
(parted) name 1 RAIDPART
(parted) set 1 raid on
(parted) align-check optimal 1
1 aligned
(parted) quit
```
Creation of the RAID array
Here are the instructions to build a RAID1 array with `mdadm`:

```
# aptitude install mdadm

-- Creating a new RAID1 array
# mdadm --create /dev/md/BORGRAID --raid-devices=2 --level=raid1 --name=BORGRAID /dev/sdb1 /dev/sdc1
mdadm: Note: this array has metadata at the start and
    may not be suitable as a boot device.  If you plan to
    store '/boot' on this device please ensure that
    your boot-loader understands md/v1.x metadata, or use
    --metadata=0.90
Continue creating array? y
```
Some explanations:
- We use the `mdadm` tool to create a new RAID device (which will be called `/dev/md/BORGRAID`). This is done with the option `--create /dev/md/BORGRAID`.
- We are using 2 RAID devices, so we pass 2 to the `--raid-devices` option.
- The array will be a RAID1 array, and we tell `mdadm` with `--level=raid1`.
- The RAID will be named BORGRAID (option `--name`).
- The last two arguments are the names of the disk partitions, `/dev/sdb1` and `/dev/sdc1`.
After creation, a device named `/dev/md127` will be available as a raw disk device. The array is in building mode, as you can see:

```
# cat /proc/mdstat
Personalities : [raid1]
md127 : active raid1 sdc1[1] sdb1[0]
      2094016 blocks super 1.2 [2/2] [UU]
      [====>................]  resync = 24.8% (519744/2094016) finish=0.8min speed=30573K/sec
```
Once it is built, we can get a few more details of its state with the following command:

```
# mdadm --detail /dev/md/BORGRAID
/dev/md/BORGRAID:
        Version : 1.2
  Creation Time : Fri Oct 17 17:47:55 2014
     Raid Level : raid1
     Array Size : 2096064 (2047.28 MiB 2146.37 MB)
  Used Dev Size : 2096064 (2047.28 MiB 2146.37 MB)
   Raid Devices : 2
  Total Devices : 2
    Persistence : Superblock is persistent
    Update Time : Fri Oct 17 17:47:55 2014
          State : clean, resyncing (PENDING)
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0
           Name : BORGRAID
           UUID : ebd5aa0e:cf1222d0:2dbcd3e6:a7aefc03
         Events : 0

    Number   Major   Minor   RaidDevice State
       0       8       16        0      active sync   /dev/sdb
       1       8       32        1      active sync   /dev/sdc
```
If you read the state, it says `clean, resyncing (PENDING)`, which means the RAID array is not built yet. To launch the RAID building, just use the following command:

```
# mdadm --readwrite /dev/md/BORGRAID
```

You will have to wait… time for a long coffee! For a 3TB array, RAID building takes about 315 minutes (about 5 hours)! If you want to see the syncing progress, just ask mdstat:

```
# cat /proc/mdstat
```
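If you would rather have a self-refreshing view, something like this works too:

```
# watch -n 5 cat /proc/mdstat
```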
Keep the configuration
Everything is alright! But we need to work a little bit more. If you reboot the server, no `/dev/md/BORGRAID` device will be available, simply because mdadm keeps its configuration and array declarations in a config file which has not been modified yet: `/etc/mdadm/mdadm.conf`.
You just have to use `mdadm` once more:

```
# mdadm --detail --scan --verbose
ARRAY /dev/md/BORGRAID level=raid1 num-devices=2 metadata=1.2 name=BORGRAID UUID=ebd5aa0e:cf1222d0:2dbcd3e6:a7aefc03
   devices=/dev/sdb1,/dev/sdc1
# mdadm --detail --scan --verbose >> /etc/mdadm/mdadm.conf
```
By default, Debian (since Jessie) auto-assembles the RAID arrays declared in `/etc/mdadm/mdadm.conf` (thanks to the systemd mdmonitor service).
To launch the RAID array manually:

```
# mdadm --assemble --scan
```
Automount the backup filesystem
OK, our RAID1 array will be ready at the end of the boot process and we want it automounted (it is not encrypted, so nothing prevents that). I put everything under `/media/backup`, create an ext4 filesystem and remove the root-reserved blocks (to maximize usable space):

```
# mkfs.ext4 /dev/md/BORGRAID
# tune2fs -m 0 /dev/md/BORGRAID
# mkdir -p /media/backup
```
For mounting filesystems, I rely on systemd unit files (`/etc/systemd/system/media-backup.mount`):

```ini
[Unit]
Description = Backup filesystem

[Mount]
What = /dev/md/BORGRAID
Where = /media/backup
Type = ext4
Options = defaults

[Install]
WantedBy = multi-user.target
```

```
# systemctl daemon-reload
# systemctl enable media-backup.mount
# systemctl start media-backup.mount
# mkdir /media/backup/borg
# chown borg:borg /media/backup/borg
```
Your RAID1 array is now ready to accept backup data!
Backup script
I use this backup script, stored in `/home/borg/backupall`:

```bash
#!/bin/bash
# Script to backup all of my clients

# global variables
CLIENTS="medspxtower trick"
LOGFILE="/var/log/backup/backup_$(date -Iseconds).log"
MAILREPORT="sysadmin@mydomain.example"
BORGMATIC="/usr/local/bin/borgmatic"

# Initialize log file
echo "Starting backup..." | tee ${LOGFILE}

# Main loop
for CLIENT in $CLIENTS; do
    # Find if client is connected
    /bin/ping -q -c 1 -W 2 ${CLIENT} &> /dev/null
    if [ "$?" -ne "0" ]; then
        echo "Can't find client ${CLIENT}, no backup for this client!" | tee -a ${LOGFILE}
    else
        # Then, do backup
        echo -e "\nBackup for ${CLIENT}:\n" | tee -a ${LOGFILE}
        ssh root@${CLIENT} "$BORGMATIC --verbosity 1" | tee -a ${LOGFILE}
    fi
done

# Don't forget to backup the backup server!
sudo $BORGMATIC --verbosity 1 2>&1 | tee -a ${LOGFILE}

echo -e "That's all folks!\n\n--\nBackup Server" | tee -a ${LOGFILE}

# Send an email to sysadmin account for reporting
cat ${LOGFILE} | /usr/bin/mail -s "Backup report!" $MAILREPORT

# Wait 10 seconds and poweroff
/bin/sleep 10s
sudo /sbin/shutdown -h

exit 0
```
As you can read, this script opens SSH sessions as root on the clients to launch borgmatic there. The client-side borgmatic configuration describes what needs to be backed up on each client. As a result, you have to authorize several keys (see the sketch below):
- the borg@goofy SSH key on all the clients' root accounts;
- each root@client SSH key on the borg@goofy account.
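In practice it boils down to appending public keys to the right `authorized_keys` files; a sketch (the `.pub` file names are mine):

```
# On each client, authorize the backup server key for root:
client# cat borg_at_goofy.pub >> /root/.ssh/authorized_keys
# On goofy, authorize each client key for the borg account:
goofy# cat root_at_client.pub >> /home/borg/.ssh/authorized_keys
```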
To automate the system a bit, use systemd to launch the script whenever you start the computer (actually, 3 minutes after boot). As of 2018, I have completely discarded crontabs!
Here is the source of `~/.config/systemd/user/backupall.timer`:

```ini
[Unit]
Description=Launch Backup at each startup

[Timer]
OnStartupSec=3 minutes
Unit=backupall.service

[Install]
WantedBy=timers.target
```
And here is the service file, `~/.config/systemd/user/backupall.service`:

```ini
[Unit]
Description=Secured Backup
Wants=backupall.timer

[Service]
Type=oneshot
ExecStart=/home/borg/backupall

[Install]
WantedBy=multi-user.target
```
Remember to use systemctl to enable those units:
```
$ systemctl --user enable backupall.service
$ systemctl --user enable backupall.timer
```
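One caveat: systemd user units only run while the user has a session, so for the timer to fire at boot without an interactive login, lingering must be enabled for the borg account (an implicit step in my setup):

```
# loginctl enable-linger borg
```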
As a result, the backup launches 3 minutes after I press the power button, and the backup server shuts itself down after the backup operations. If you have configured an email service on the backup server, you will receive a report of the backup operations. It couldn't be easier to do a backup: just press the damn button and you are done!
Client backup software installation
You just need to do the following steps:
- Install borgbackup on the client.
- Install borgmatic.
- Authorize SSH backup server key for root access.
- Add a repository on the server.
- Modify borgmatic configuration.
Borg installation
```
# apt install --no-install-recommends borgbackup python3-pip
# pip3 install borgmatic
```
Borg repository
Now that we have installed borgmatic, it is time to create a repository on the backup server for this client. I have chosen to have one repository per client, for security reasons: I really don't want a corrupted or malfunctioning client to wipe everything in the repository (or have access to everything). It has an impact on deduplication performance, particularly if your clients share the same files, but in my configuration it is not really a problem.
Power up the backup server and ssh in as the borg user (you also have to authorize the client's root SSH key on the borg account). Remember to stop the timer before the 3-minute wait is over:
```
$ systemctl --user stop backupall.timer
```
Then, you can create the repository:

```
$ mkdir /media/backup/borg/{client_name}
$ borg init --encryption=repokey /media/backup/borg/{client_name}
```

borg will then prompt you for the passphrase; use the same one ("my very long passphrase for making backup on this workstation") that goes into the borgmatic configuration.
That's all folks!
Borgmatic configuration
Borg has been built as a command-line tool only; there is no real configuration file for it. Borgmatic is a kind of wrapper that launches borg backups based on a configuration file.
Here is an example, which can be put under `/etc/borgmatic/config.yaml`:

```yaml
# Where to look for files to backup, and where to store those backups. See
# https://borgbackup.readthedocs.io/en/stable/quickstart.html and
# https://borgbackup.readthedocs.io/en/stable/usage.html#borg-create for details.
location:
    # List of source directories to backup (required). Globs and tildes are expanded.
    source_directories:
        - /etc
        - /var/log
        - /root
        - /media/data/Documents
        - /media/data/music
        - /media/data/photos
        - /media/data/archives
        - /media/data/games
        - /home/medspx
        - /media/data/movies

    # Paths to local or remote repositories (required). Tildes are expanded. Multiple
    # repositories are backed up to in sequence. See ssh_command for SSH options like
    # identity file or port.
    repositories:
        - borg@goofy:/media/backup/borg/{client_name}

    # Any paths matching these patterns are excluded from backups. Globs and tildes
    # are expanded. See the output of "borg help patterns" for more details.
    exclude_patterns:
        - '*.pyc'
        - '*/tmp/*'
        - '*/.cache/*'
        - '*/cache/*'
        - '*/.thumbnails*/'
        - '*/PlayOnLinux/'

# Repository storage options. See
# https://borgbackup.readthedocs.io/en/stable/usage.html#borg-create and
# https://borgbackup.readthedocs.io/en/stable/usage/general.html#environment-variables for
# details.
storage:
    # Passphrase to unlock the encryption key with. Only use on repositories that were
    # initialized with passphrase/repokey encryption. Quote the value if it contains
    # punctuation, so it parses correctly. And backslash any quote or backslash
    # literals as well.
    encryption_passphrase: "my very long passphrase for making backup on this workstation"

# Retention policy for how many backups to keep in each category. See
# https://borgbackup.readthedocs.org/en/stable/usage.html#borg-prune for details.
# At least one of the "keep" options is required for pruning to work.
retention:
    # Number of daily archives to keep.
    keep_daily: 0

    # Number of weekly archives to keep.
    keep_weekly: 3

    # Number of monthly archives to keep.
    keep_monthly: 6

    # Number of yearly archives to keep.
    keep_yearly: 1

# Consistency checks to run after backups. See
# https://borgbackup.readthedocs.org/en/stable/usage.html#borg-check and
# https://borgbackup.readthedocs.org/en/stable/usage.html#borg-extract for details.
consistency:
    # List of one or more consistency checks to run: "repository", "archives", and/or
    # "extract". Defaults to "repository" and "archives". Set to "disabled" to disable
    # all consistency checks. "repository" checks the consistency of the repository,
    # "archive" checks all of the archives, and "extract" does an extraction dry-run
    # of just the most recent archive.
    checks:
        - disabled

# Here you can add hooks to launch pre/post-backup scripts
hooks:
    before_backup:
        - /root/myprescript.sh
    after_backup:
    on_error:
        - echo 'backup is fucked!' >> /tmp/fuckbackup.txt
```
Be careful: borgmatic is very sensitive to indentation, and you will only get the information about what's wrong at the end of the backup. In case of trouble, increase the verbosity in `backupall`.
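To check the configuration without waiting for the weekly run, you can launch a manual, verbose backup from the client and then list the resulting archives with plain borg; a sketch:

```
# borgmatic --verbosity 2
# borg list borg@goofy:/media/backup/borg/{client_name}
```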
Externalisation
Now we are able to make backups on a dedicated backup server, but what if your house burns down or your server is stolen? To face such risks, you have to externalize your backups. I have bought a third 3TB disk for this task and put it in a USB enclosure. Whenever I want to externalize, I just have to connect this external disk to the backup server and use rsync to copy the borg repositories.
I only externalize once every 6 months.
I have made a dedicated script for that operation. It is manually launched from `/home/borg/externalize`:

```bash
#!/bin/bash
# Script to externalize backups

# Some global and immutable variables
LOGFILE="/var/log/backup/externalize_$(date -Iseconds).log"
MAILREPORT="sysadmin@mydomain.example"
EXTBACKUP="/dev/disk/by-partlabel/EXTBACKUP"
BACKUPREPO="/media/backup/borg"
EXTMOUNTPOINT="/media/externdisk"

# verify if external disk is up
## The partition 1 of external disk must be named EXTBACKUP
if [ ! -e "$EXTBACKUP" ]
then
    printf "Can't find external disk partition, aborting..." | mail -s "[Backup] Externalisation problems !" $MAILREPORT
    exit 3
fi

# mount the external disk
mount | grep -q $EXTMOUNTPOINT
if [ "$?" -gt "0" ]
then
    mount -t ext4 $EXTBACKUP $EXTMOUNTPOINT
    if [ "$?" -gt "0" ]
    then
        printf "Aborting: cannot mount the external encrypted device !" | mail -s "[Backup] Externalisation problems !" $MAILREPORT
        exit 5
    fi
fi

# suspend systemd backupall.timer
systemctl stop backupall.timer

# make an rsync between them
REPORT=$(rsync -a --stats "$BACKUPREPO" "$EXTMOUNTPOINT")

# close properly
sleep 2
umount $EXTMOUNTPOINT

# send an email to tell it is terminated
printf "Hello,

Externalisation is done. Here is the rsync report:

$REPORT

You can now unplug the disk from the backup server.

See you soon !
--
Backup externalisation program" | mail -s "[Backup] Externalisation completed !" $MAILREPORT

# reactivate systemd backupall.timer
systemctl start backupall.timer

exit 0
```
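Note that the script expects the first partition of the external disk to carry the GPT partition label EXTBACKUP, so that it appears under `/dev/disk/by-partlabel`. If your disk is not labeled yet, a GNU Parted sketch (assuming a GPT disk on `/dev/sdX`):

```
# parted /dev/sdX name 1 EXTBACKUP
```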
Conclusions
I have been using this backup system for more than 4 years. It is quite simple to use: just power up the server and everything gets done. It has proven reliable and, for the moment, I only use 1.2TB, thanks to deduplication.
The only real drawback of this system is linked to network problems. Everything operates over SSH links and, if you have a WiFi network, backups will take time. As a countermeasure, I made the first backup, which is a full backup, over a direct computer-to-computer Ethernet connection (1Gbit/s for free). Sometimes I face SSH disconnections, but it is always a WiFi problem. As of 2018, nothing beats a true dedicated Ethernet cable network!
One other drawback is that you also have to remember to power up all your clients before the backup! I have started to work on Wake-on-LAN procedures but, in my case, it is just overkill because I only have two clients to back up.