Introduction


I have been fascinated by clusters since 1995, when DEC VMS clusters were state-of-the-art and the first Windows clusters emerged for simple storage sharing. Almost a decade later, cluster technology has become mainstream through Linux, with falling hardware prices making it affordable for "home" use. What could I use a cluster for? Finally start my education in 3D Graphics Rendering and Computer Art Design? Donate power to SETI? Start researching the possibilities of crunching passwords?

Linux Cluster
Pictured left: the diskless Lubic Linux cluster, built with 3x AMD 2600 CPUs: Dwarf-1, 2 and 3

The budget had top priority, so costs had to be kept down, and soon a design emerged into reality: a Lubic frame system would house up to eight motherboard/CPU combos, stacked in shelves of two systems each. AMD CPU systems were chosen for the best price/performance value. The cluster members would be diskless and boot their OS from a cluster controller over the network. The cluster controller, currently my old Sparc Classic, is planned to eventually be one of the eight local boards: say 'Hi' to "SnowWhite and the Seven Dwarfs".

Since the systems were planned to be diskless, all I needed to add were very small power supplies and RAM. To ensure reliable uptime, I invested in smaller but high-quality XMS memory sticks from Corsair. The frame planning and building turned out to be a lot of fun; these Lubic construction sets bring back memories of Lego and make childhood dreams come true. A local plastics shop cut the acrylic shelves to size, and they proved to be a perfect fit. Once everything was installed, the operating system and the boot process became my main concern. Most modern PC motherboards provide a network boot option via PXE in their BIOS, which I carefully verified beforehand by reading the motherboard specs and manuals. PXE stands for Preboot Execution Environment and is a DHCP extension to request a network bootstrap program via TFTP. I am using PXELinux from H. Peter Anvin. In its configuration, a kernel and its boot options (e.g. a ramdisk image) can be tied to the specific MAC addresses of the cluster members, allowing fine-grained control and flexibility over who gets which configuration. For example, all cluster members can run the same software and OS, or the cluster can be split to work on different tasks in parallel. This flexibility also makes it possible to integrate different hardware into the cluster.
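To give an idea of how the pieces fit together, here is a minimal sketch of such a per-client PXE setup; the addresses, MAC address, file names and paths are placeholders, not my actual configuration:

    # /etc/dhcpd.conf on the cluster controller: point PXE clients at the TFTP server
    subnet 192.168.1.0 netmask 255.255.255.0 {
      range 192.168.1.10 192.168.1.17;        # addresses for the dwarfs
      next-server 192.168.1.1;                # TFTP server (the cluster controller)
      filename "pxelinux.0";                  # the PXELinux network bootstrap program
    }

    # /tftpboot/pxelinux.cfg/01-00-0c-29-aa-bb-01
    # PXELinux looks for a config file named after the client's MAC address
    # (prefixed with the ARP type "01-"), so each dwarf can get its own kernel.
    DEFAULT dwarf
    LABEL dwarf
      KERNEL bzImage-2.4-diskless
      APPEND initrd=miniroot.gz ramdisk_size=65536 root=/dev/ram0

A client whose MAC address has no dedicated file falls back to pxelinux.cfg/default, which is a convenient way to hand every member the same configuration.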

Linux Cluster Boot 1
Pictured left: A packet trace of the PXE client boot process, click picture to see full output (1024x768, 48KB).
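A similar trace can be captured on the cluster controller while a client boots; a small sketch, assuming the server's network interface is eth0:

    # watch the DHCP and TFTP packets of a booting PXE client
    tcpdump -i eth0 -n port 67 or port 68 or port 69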

After the boot loader had been verified to work, I configured a dedicated, efficient, minimalistic kernel by leaving out all unneeded hardware options (disks, graphics, etc.), and I decided to compile the device drivers and options directly into the kernel rather than as modules. That further simplified the setup, and the kernel image was still a manageable ~900KB in size. I used the 2.4 Linux kernel sources from my SuSE 9.0 installation and also experimented briefly with 2.6.
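For reference, such a kernel follows the usual 2.4 build procedure; the options noted in the comments are what a diskless setup like this minimally needs (the exact NIC driver depends on the boards), and the target path is only an example matching the PXELinux sketch above:

    # build a minimal 2.4 kernel with everything compiled in (no modules)
    cd /usr/src/linux
    make menuconfig      # enable the NIC driver, CONFIG_BLK_DEV_RAM and CONFIG_BLK_DEV_INITRD as built-in
    make dep
    make bzImage
    cp arch/i386/boot/bzImage /tftpboot/bzImage-2.4-diskless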

The next step was building a small root filesystem image to fit in a compressed ramdisk. BusyBox is an embedded Linux solution and the perfect tool, providing a mini-root with the most important binaries in one set. It was so small that I could add all the programs I wanted and still size the ramdisk at 64MB, of which I am only using about 20%. 64MB still sounds big for loading over the network, but the beauty of Linux is that the ramdisk can be compressed with gzip and is decompressed on the fly at load time. The 64MB ramdisk at 20% usage compresses to a tiny image of only 4.4MB.
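The ramdisk itself can be built as a loopback-mounted ext2 image; a rough sketch of the procedure, where the paths and the symlinked applets are just examples and device nodes, /etc/inittab and the init scripts are omitted:

    # create and populate a 64MB ext2 ramdisk image
    dd if=/dev/zero of=miniroot bs=1k count=65536
    mke2fs -F -m 0 miniroot
    mount -o loop miniroot /mnt/miniroot
    mkdir -p /mnt/miniroot/bin /mnt/miniroot/dev /mnt/miniroot/etc /mnt/miniroot/proc /mnt/miniroot/tmp
    cp busybox /mnt/miniroot/bin/
    for cmd in sh ls cat mount ifconfig; do ln -s busybox /mnt/miniroot/bin/$cmd; done
    umount /mnt/miniroot
    gzip -9 -c miniroot > /tftpboot/miniroot.gz    # shrinks to a few MB at 20% usage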

Linux Cluster Boot 2
Pictured left: the cluster boot screen with DHCP, NFS mount and login on dwarf-1

Adding more standard Linux programs to the mini-root, however, was not without its challenges. For example, I wanted to have the "man" command available and ended up adding quite a number of libraries and programs responsible for various aspects of the modern "man" command, such as flexible screen formatting (the groff and troff packages) or manpage compression (zlib). The other option would have been mounting the rest of the OS via NFS and keeping the ramdisk even smaller. I decided to keep the OS in the ramdisk to make the clients more independent of a possible short server outage (server upgrade, etc.). I still mount an NFS directory on each cluster client, but that is purely for information sharing and as a place to deposit computing results.
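That shared directory only needs one export on the controller and a mount on each client; a minimal sketch with placeholder addresses and paths:

    # /etc/exports on the cluster controller
    /export/cluster  192.168.1.0/255.255.255.0(rw,sync)

    # on each dwarf, e.g. from the boot script in the ramdisk
    # (-o nolock avoids needing a portmapper on the BusyBox clients)
    mount -t nfs -o nolock 192.168.1.1:/export/cluster /mnt/shared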

I tested the cluster performance by running the RC5-72 client from distributed.net, achieving 133 work units/day per client and almost reaching 400 WU/day with the initial 3 cluster members dwarf-1 through dwarf-3. It catapulted me into the top 5000 crunchers. I had to back out of the slight overclocking I was practising, as 2 of my clients began to lock up and needed reboots after several days. In the end, it is probably cheaper (and quicker) to add cluster members than to overclock. So what are my next plans? I want to try out the openMosix cluster extension for transparent process distribution among cluster members, and look into distributed 3D graphics rendering.

Local How-2's and Configurations


System Photos


Linux Cluster 2, Linux Cluster 3, Linux Cluster 4, Linux Cluster 5

External Links


Credits and Copyrights: