There is now some significant continuity in the systems on which we run our home and our businesses-from-home. Although I ran home servers before then, we can trace a direct line from our existing server to an original Ubuntu Warty Warthog server built in 2004. That server was updated at regular intervals, but around 2012 I moved the systems to pure Debian. Debian seemed less particular than Ubuntu at that stage, and its long-term stability seemed better than Ubuntu's.
I have written about our home systems, which are heavily influenced by the fact that we are off-grid, and need to be very careful about our power use, especially here. I noted at the end of that article that migrating the server to a Raspberry Pi would save yet more power. The Intel Atom-based mini-ITX board was not great on power consumption, and used 18w of power while at idle, going up to around 22w when working hard. That contrasts with the 8w of power that my AMD-based laptop uses. I would have loved to find a low-power AMD-based mini-ITX board, but can't find one anywhere. That would allow me to stay with the x86 architecture, and even have the ability to run virtual machines. While there is nothing wrong with the existing system, running Debian Wheezy, at some stage I will need to update it to Debian Jessie. But with the cruft of many years on the server, I think it is time to start afresh, and migrate data to a newly built system. So the idea of moving the system to the low power of a Raspberry Pi came around again, and this time, the extra power and memory of the Pi 3 would be useful. I was a little concerned, because from time to time the server uses as much as 400MB of memory, of the 2GB that the Atom system had installed, and I would be restricted to 973MB on the Pi.
The services we run are:-
- Email (dovecot, postfix, plus security enhancements, anti-spam, etc.)
- Database services - PostgreSQL and MySQL
- Groupware (now morphed to local "cloud" services) - calendars, address books, file syncing - Nextcloud
- Weather station and history
- DLNA local multimedia services
- Secure chat
- Nagios system monitoring
The Raspberry Pi 3 can now boot directly off a USB disk, but there are some advantages to doing the initial boot off the microSD card, such as being able to run an initramfs to support file systems other than ext4. It may also help with disaster recovery, by making the boot process slightly more portable. For the past year or so, a friend who wanted some home-based services has been running a Pi against a USB disk, and it has turned out to be reliable and resilient. So although much about how a Pi works is contrary to conventional thinking, I was confident that the conventional objections to using a Pi as a server were closer to prejudices than to real reasons.
So I built up a Raspbian-based Pi system, and started syncing the 1.2TB of data that the existing system maintains. Helen was due to be away for a couple of days, and her travelling time gave me a window to get the new systems going.
All went well with the data migration. I took dumps of the PostgreSQL and MySQL databases and then restored them on the new system. Email services were shut down and a final rsync of the mailboxes - about 20GB - was done. Then I started bringing up the new services. DNSMasq was stopped on the old server and started on the new. It provides not only DHCP addresses and a locally-cached DNS service, but also a lot of useful wheezes, like ad blocking and making it easy to test, for example, MX records.
Email, the most important service, just worked. I started afresh with spamassassin, so it needed to be re-trained, but that was easy to do against the existing Junk folders. Nagios, which monitors 14 hosts and 33 services on three sites, mostly worked, but two servers would not respond. I turned it off while I worked on other services.
Weatherview (http://www.wviewweather.com/) was a problem. The SQLite databases could not be copied, because the different ARM architecture of the Pi, in comparison with the x86 of the Atom, caused an incompatibility. This doesn't make much sense, as one would expect the on-disk format to be the same, but the error message was clear, so I resorted to dumping the databases on the old system and restoring them on the new one, the same as for the "proper" databases. I had not realised the amount of customisation I had done to this system, and it took some hours to re-wrap my head around how it all worked, and what databases and text files to bring across. The first result was bizarre - temperature was reported as much higher than it was, and so on - but eventually I realised it had something to do with recognising the USB input from the weather station. A udev rule sorted that out and finally the weather station was working as before, with all history accessible.
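The dump-and-restore route for the SQLite files can be sketched in Python, whose standard `sqlite3` module can write a database out as plain SQL text and replay it elsewhere. This is only an illustration, not the commands I actually ran; the file names and the `archive` table used in the usage note are hypothetical.

```python
import sqlite3


def dump_sql(db_path):
    """Return the whole database as a list of SQL statements (plain text)."""
    conn = sqlite3.connect(db_path)
    try:
        # iterdump() yields the schema and data as SQL, which is
        # independent of the machine that wrote the original file.
        return list(conn.iterdump())
    finally:
        conn.close()


def restore_sql(statements, db_path):
    """Rebuild a database at db_path from previously dumped SQL statements."""
    conn = sqlite3.connect(db_path)
    try:
        conn.executescript("\n".join(statements))
        conn.commit()
    finally:
        conn.close()
```

On the old machine one would call `dump_sql("old.sdb")` and save the statements to a text file; on the new machine `restore_sql(statements, "new.sdb")` recreates the database, after which a query against the (hypothetical) `archive` table should return the same rows as before.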
Nagios was frustrating. It felt like an ssh issue. Running remote Nagios checks via ssh sometimes fails if an initial manual ssh connection has not yet been made, because the host keys need to be accepted manually, but it could not have been that, as only 3 of the remote systems had this problem. It took a lot of puzzling to realise that it was actually a DNS issue. I did not copy the old /etc/hosts file that DNSMasq uses, but started afresh, and for some reason Nagios wanted to resolve the internal IP addresses of the remote hosts itself, even though they are accessed via a nested SSH connection, rather than passing the names on as a request to the remote SSH server. Adding the two missing entries, whose absence was probably a copy error on my part, resolved the problem.
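A check like the following would have found the missing entries much faster than my puzzling did. It is a small sketch, not something I ran at the time: it parses the text of a hosts file and lists which of the names a monitoring setup needs are absent. The host names and addresses in the usage note are made up.

```python
def parse_hosts(text):
    """Map every host name and alias in hosts-file text to its IP address."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue  # skip blank or comment-only lines
        ip, *names = line.split()
        for name in names:
            table[name] = ip
    return table


def missing_hosts(required, hosts_text):
    """Return the required names that have no entry in the hosts text."""
    known = parse_hosts(hosts_text)
    return [name for name in required if name not in known]
```

For example, with a hosts file containing only `192.168.1.10 pi pi.lan`, calling `missing_hosts(["pi", "remote-a", "remote-b"], text)` returns the two hypothetical remote names, which are the ones that need adding.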
The myriad little things you forget that go to making a running system were then re-created or copied across, such as the firewalling rules, and I took easily accessible copies of the old configuration files for everything. I decided to build the configuration of fail2ban, an essential part of the security model, from scratch, as it was one of the cruft-filled sets of configuration that date back a decade or so, and needed to be cleaned.
Now everything was working perfectly, with all previous services replicated on the fresh system. Memory use was manageable, and the most it has touched is 280MB of the 973MB available. I became more confident that the solution would work, so the time came to move the new system. It was still rather naked: just the bare board and two USB disks (which will eventually be migrated to a single 2TB disk), lying on the desk. So I shut everything down, and re-routed the cables to a more permanent location.
And the system refused to boot.
I connected up a keyboard and monitor, and the message on the screen appeared to be an SD card error. That seemed odd, but I removed the SD card and stuck it into my laptop. It came up sweetly. I then tried the USB disks, and one, inevitably the main one, just generated errors, but not disk errors. The problem was a loose USB cable for the disk. I swapped the cables around, and while the connector was still a little sloppy in the slot, the system booted without problem.
The following morning, the weather station reported a maximum wind speed of minus one million miles per hour. That turned out to be my nerfing the weather station power supply when moving cabling around, so restoring power and re-connecting the USB cable sorted that out, for the loss of overnight data from the powerless system.
Just as with a corporate version of such a migration, there have been many little scripts, backup utilities and services which had been forgotten about but which need attention. Examples are a local service using xtide to show us local tide times, cron jobs to update databases with weather data, security reports and so on. The most significant example was RRD graphs. Apart from system uses, such as the postfix mailgraph utility, I use an RRD database to generate weather graphs. However, the scripts came up with "ERROR: This RRD was created on another architecture". I would have thought RRD files would be architecture-independent, but this is not the case, even between 32-bit and 64-bit systems. I had to dump the database to an XML file on the old system (rrdtool dump oldrrd.rrd > oldrrddata.xml) then import the result on the new architecture (rrdtool restore -f oldrrddata.xml newrrd.rrd). These are the kind of little issues that I expect will crop up over the next little while and need to be dealt with one by one.
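With more than a couple of RRD files, the dump and restore steps are worth scripting. The following is a rough sketch using rrdtool's `dump` and `restore` subcommands through Python's `subprocess` module; it assumes rrdtool is on the PATH, and the helper names are mine, not part of rrdtool.

```python
import subprocess
from pathlib import Path


def xml_path_for(rrd_path):
    """Name of the XML dump file that sits next to an .rrd file."""
    return Path(rrd_path).with_suffix(".xml")


def dump_rrd(rrd_path):
    """Old machine: write the output of `rrdtool dump` to an XML file."""
    xml_path = xml_path_for(rrd_path)
    with open(xml_path, "w") as out:
        subprocess.run(["rrdtool", "dump", str(rrd_path)], stdout=out, check=True)
    return xml_path


def restore_command(xml_path, rrd_path):
    """New machine: argument list for `rrdtool restore`.

    -f (--force-overwrite) replaces an existing target file.
    """
    return ["rrdtool", "restore", "-f", str(xml_path), str(rrd_path)]


def restore_rrd(xml_path, rrd_path):
    """New machine: rebuild a native .rrd file from the XML dump."""
    subprocess.run(restore_command(xml_path, rrd_path), check=True)
```

The idea is to loop `dump_rrd` over the `.rrd` files on the old server, copy the XML files across, and then call `restore_rrd` for each one on the Pi. XML dumps are much larger than the binary files, so they are best deleted afterwards.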
A bigger issue was that, with the newer versions of postfix and dovecot, none of our Android tablets or phones would retrieve mail, coming up with a strange SSLv3 error, which is bizarre as SSLv3 is disallowed on the server. This turned out to be related to using a proper Let's Encrypt certificate rather than the older self-signed certificate, so the setting on the tablet email accounts had to be just "SSL/TLS" rather than "SSL/TLS Allow all certificates". One phone, running Android 6, remains stubborn, but no doubt this additional difficulty with making such a migration will also soon be resolved. It's easier on a home system than a corporate one, as when these issues arise, I do not have to cope with someone using the problems to exercise one-upmanship or exert power over a trying situation, without helping to resolve the issues at all. Other than the inter-personal issues, the technical and operational process is exactly the same in our experience as in the corporate world, and I feel the same sense of responsibility over my work.
So far so good. The power consumption is 6w at idle, going up to 8w when the system is working hard, so a third of the Atom system. The power of the ARM processor cores is entirely adequate, and the lower disk throughput is not really noticeable, although the Nextcloud web access is a little slower. The only change I have made to various scripts is to stop using the pxz parallel compression utility for backup files. pxz gives a great compression ratio, but is very demanding on system memory. I have altered the affected scripts to use pbzip2 instead.
What a brilliant invention the Pi is. It challenges the culture of bigger, faster, hotter that characterises the concept of "improvement" in the IT industry and makes you think about some technical aspects we tend to take for granted. And it can be reliable and resilient, so we will see how long it is before the next set of changes is necessary.
Edited to add: A week after the changeover to the Pi, we are happy that there is little or no noticeable difference from running our services as before, except that the Pi used about 6w at idle, rising to 8w when both disks were in use. (I kept the old system, which was on a 2TB disk, and built the Pi using 2x 1TB disks.) Now that the system is stable, I have copied everything over to the 2TB disk, and idle power consumption is down to 4w, while start-up power is a mere 6w. I would say a difference of 18w for the old system compared to 4w for the new is exactly what we hoped to achieve, and the transition from the old Debian Wheezy x86 system to Raspbian Jessie ARM has been completed satisfactorily.
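To put the idle figures above into off-grid terms, here is the arithmetic as a few lines of Python. It assumes, for simplicity, that the server sits at idle around the clock, which is not quite true but close enough for a home server.

```python
HOURS_PER_DAY = 24


def daily_saving_wh(old_idle_w, new_idle_w):
    """Watt-hours saved per day, assuming the machine idles all day."""
    return (old_idle_w - new_idle_w) * HOURS_PER_DAY


def fraction_of_old(old_idle_w, new_idle_w):
    """New idle draw as a fraction of the old idle draw."""
    return new_idle_w / old_idle_w


# Idle figures from the text: 18 W for the Atom system, 4 W for the Pi.
saving = daily_saving_wh(18, 4)    # 14 W saved for 24 hours
share = fraction_of_old(18, 4)     # 4/18 of the old draw
print(f"{saving} Wh saved per day; the Pi draws {share:.0%} of the old idle power")
```

That works out to 336 Wh a day, and the Pi idles at a little under a quarter of what the Atom system needed.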