Technical Design - Architecture
Saturday, January 19 2013.
The Assynt Community Digital Archive was designed to be a stand-alone network, recognising, among other issues, that the Archive might need to move to a different location during its lifetime. In some instances, communities may simply wish to add a digital archive to an existing heritage project, but even under those conditions, working to different life-cycles may suggest a stand-alone solution.
One of the technology infrastructure principles that has stood up to experience over the last 25+ years is that of separate servers for separate services. Some years ago that would have implied buying additional hardware to gain a spread of risk, to be able to change aspects of the infrastructure with minimal effect on other services, and to size the installation appropriately. These days, with virtualisation available in both hardware and software, many of the benefits of the "separate servers for separate services" philosophy can be achieved with minimal hardware. So, in general, the various services that make up a fully functioning stand-alone network are implemented as a series of virtual machines, which can themselves be spread across physical machines as desired.
Not long ago this was expensive or not very efficient. With KVM technology incorporated into the Linux kernel, a fully Free Software solution in keeping with the standards-based, long-lived, cost-controlled demands of a community archive can be implemented. Other technology implementations that achieve the same outcome obviously exist. The virtual services are managed with the standard Linux utility virt-manager, and non-resident management tools can also be used to manage the systems.
Having settled how the services will be hosted, the services themselves should be defined. Some form of directory service should be installed, though this may be as simple as individual logins for the individual services.
In our case, we chose OpenLDAP, again in the knowledge that the service it provides can be delivered in other ways, and that it can be swapped out as long as the same standard service is deployed.
An email service was also necessary, to receive and send mail generated by systems automation as well as service information from the DSpace archive. We chose Exim, the Debian standard mail transfer agent, although alternatives such as Postfix would be just as adequate. For access, Dovecot IMAP was used. The author has considerable experience of Dovecot, and its maildir implementation has proved robust and convenient.
A deployment server provides web sites for the Archive, allowing groups to set up projects to display and interpret objects from the Archive, and providing the main point of access to the Archive services themselves. In addition, the web services can be used to document the installation, using DokuWiki or similar. The DSpace software, along with its PostgreSQL database, runs on another virtual server.
This combination has allowed for remarkable dynamism to cope with changed circumstances or to fine-tune the installation. For example, the Archive data store (assetstore in DSpace-speak) was originally on a separate virtual container, then moved to its own physical disk partition under KVM, and is now exposed as an NFS share to aid portability of the Archive virtual machine. In addition, the mail server has been used to set up email lists for other community projects, such as the Assynt Festival, as the community comes to understand what can be done when it owns and runs its own technology infrastructure.
One aspect of DSpace for which vague plans have been punted, but not yet implemented, is the possibility of running a distributed archive. This could allow more local services across a geographic region while still retaining the ability to access and search the entire data set.
In areas like Assynt, geographically dispersed, such an implementation may be attractive.
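As an illustration only, the "separate servers for separate services" layout described in this section can be sketched as a small Python model. This is not a record of the actual installation: the VM names, the host names (host-a, host-b), the assignment of VMs to hosts, and the placing of OpenLDAP on a VM of its own are all assumptions made for the example. The sketch shows how services group into virtual machines and how a virtual machine can be reassigned to another physical machine without touching the others.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class VirtualMachine:
    name: str                   # hypothetical VM name
    host: str                   # hypothetical physical machine running the VM
    services: Tuple[str, ...]   # services provided by this VM


# Illustrative layout only: names and host assignments are assumptions,
# not taken from the real Archive installation.
LAYOUT = [
    VirtualMachine("directory", "host-a", ("OpenLDAP",)),
    VirtualMachine("mail", "host-a", ("Exim", "Dovecot")),
    VirtualMachine("web", "host-a", ("web sites", "DokuWiki")),
    VirtualMachine("archive", "host-b", ("DSpace", "PostgreSQL")),
]


def vms_by_host(layout: List[VirtualMachine]) -> Dict[str, List[str]]:
    """Group VM names by the physical machine they run on."""
    grouped: Dict[str, List[str]] = {}
    for vm in layout:
        grouped.setdefault(vm.host, []).append(vm.name)
    return grouped


def move_vm(layout: List[VirtualMachine], vm_name: str, new_host: str) -> List[VirtualMachine]:
    """Return a new layout with one VM reassigned to another physical host."""
    if vm_name not in {vm.name for vm in layout}:
        raise KeyError(vm_name)
    return [
        VirtualMachine(vm.name, new_host, vm.services) if vm.name == vm_name else vm
        for vm in layout
    ]


def duplicated_services(layout: List[VirtualMachine]) -> List[str]:
    """List services that appear on more than one VM."""
    seen = set()
    dupes: List[str] = []
    for vm in layout:
        for service in vm.services:
            if service in seen and service not in dupes:
                dupes.append(service)
            seen.add(service)
    return dupes


if __name__ == "__main__":
    print(vms_by_host(LAYOUT))
    # Consolidate everything onto one physical machine, e.g. for a relocation.
    print(vms_by_host(move_vm(LAYOUT, "archive", "host-a")))
```

The point of the model is that relocation changes only the host field of a virtual machine; the services it provides, and the other virtual machines, are left as they were.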