This article was originally published in Finnish in MikroPC magazine, issue 14/1999, 1.10.1999.


CERN replaces supercomputers with Linux clusters

Linux PCs are replacing supercomputers in scientific computing. At CERN, the world's largest particle physics research centre, staff are already preparing to set up thousands of PCs as Linux clusters.

The scientific work at CERN, which is located in Geneva, Switzerland, is based on collision experiments carried out with particle accelerators. The goal is to study the deepest structure of matter and the events at the birth of the universe, the Big Bang. The research places heavy demands on the computing facilities, because it produces a very large amount of data.

1.2 petabytes of information

The data flow from the experiment sites is 30 megabytes a second, so a typical 6 GB PC hard disk would be full in about three minutes. The information is stored in a 1.2-petabyte central tape store that is currently filling at a rate of 200 terabytes a year. The hard disk space for temporary storage and processing alone amounts to 20 terabytes.

Tape silos: Each of the four towers contains 6000 slots for 50 GB tapes. If the tape readers in one silo are all in use, the intelligent robots can pass tapes to each other through holes in the walls.
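
The figures above can be checked with a few lines of arithmetic. The following is only a rough sketch that assumes decimal units (1 GB = 10^9 bytes) and a constant fill rate; the variable and function names are made up for illustration.

```python
# Decimal units are an assumption: 1 MB = 10**6 bytes, and so on.
MB, GB, TB, PB = 10**6, 10**9, 10**12, 10**15


def seconds_to_fill(capacity_bytes, rate_bytes_per_s):
    """Time needed to fill a store at a constant data rate."""
    return capacity_bytes / rate_bytes_per_s


# A 6 GB PC disk receiving 30 MB/s from the experiments.
disk_fill_s = seconds_to_fill(6 * GB, 30 * MB)

# Four towers, 6000 slots each, 50 GB per tape.
silo_capacity = 4 * 6000 * 50 * GB

# How long the tape store lasts at 200 TB a year.
years_to_fill = silo_capacity / (200 * TB)

print(f"6 GB disk full after {disk_fill_s:.0f} s ({disk_fill_s / 60:.1f} min)")
print(f"Tape capacity: {silo_capacity / PB:.1f} PB")
print(f"Years to fill at 200 TB/year: {years_to_fill:.0f}")
```

The slot count and tape size multiply out to the quoted 1.2 petabytes, and at the current rate an empty store of that size would last about six years.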

The physical location of the bits doesn't matter to the physicists using the system. They issue commands like "get the August data from the Delphi experiment", and the files are collected from tape onto disk servers for faster access. Processing jobs are sent to a central queue, and the results can be read back once the job has finished.

Most of the thousand processors in the computing centre are in fact dedicated purely to number crunching; their total capacity is about 200 billion operations per second.
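
The stage-then-queue workflow described above could be modelled roughly as follows. This is a toy sketch, not CERN's software: the catalogue contents, the tape labels and every function name are invented for illustration.

```python
from collections import deque

# Hypothetical catalogue mapping (experiment, month) to tape labels.
# The labels are invented; only the Delphi/August example is from the article.
TAPE_CATALOGUE = {
    ("Delphi", "August"): ["T00017", "T00018"],
    ("Delphi", "September"): ["T00019"],
}

disk_cache = {}      # datasets already staged onto disk servers
job_queue = deque()  # central first-in, first-out queue of pending jobs


def stage(experiment, month):
    """Copy a dataset from tape to disk unless it is already staged."""
    key = (experiment, month)
    if key not in disk_cache:
        disk_cache[key] = list(TAPE_CATALOGUE[key])
    return disk_cache[key]


def submit(name, experiment, month):
    """Put a processing job on the central queue."""
    job_queue.append((name, experiment, month))


def run_next():
    """Run the oldest queued job and return its name and file count."""
    name, experiment, month = job_queue.popleft()
    files = stage(experiment, month)
    return name, len(files)
```

A user would call `submit("histogram", "Delphi", "August")` and later collect the result of `run_next()`, without ever needing to know which tapes held the data.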

No backup

The key element for moving data around is the home-made RFIO system (Remote File I/O), which makes it possible to open a fast file transfer connection between any two servers. The equipment is grouped into logical clusters in an attempt to minimize the number of expensive top-level switches. Usually 20-30 machines are assigned to a single computing task. If needed, the roles of the machines and the structure of the system can easily be modified.

Only one copy of the raw data from the experiments is stored; the resources don't allow backups. About 1/1000 of the data is lost each year due to tape quality problems, according to Bernd Panzer, who is responsible for the Central Data Recording service. That doesn't do much harm, though, as the experiments lose far more data to problems on their own side. The completed results are naturally taken good care of.

Linux requires local experts

CERN has already almost given up the mainframes and supercomputers that have traditionally been used in scientific computing. Nowadays its biggest single machine is a 28-processor Silicon Graphics Origin 2000. In Finland, the similar 128-processor model and the older CRAY T3E at CSC (Center for Scientific Computing) easily leave it behind.

"The key parts of the hardware are currently mid-range Unix servers, but the direction is clearly towards PC hardware", says Data Management Section leader Harry Renshall.

There are already about 200 Linux and 50 NT machines in the main computing hall. They require a new approach to problems: programs must distribute well, because communication between processors is slow compared with big SMP machines.

A Linux cluster: Beowulf cluster software is a relatively lightweight addition to the base Linux system. The machines are interconnected using fast Ethernet of at least 100 Mbit/s.

Windows NT's advantage is its better object-oriented software development tools, but Linux is otherwise easier to migrate to, as the old software is Unix-based. It is also easier to manage remotely.

"The stability and functionality of Linux are now close to those of commercial Unix systems. A year ago the situation was still very different", Renshall says.

"The big Unix vendors offer expensive but high-quality support. Windows NT and Linux, by contrast, are in the same position when it comes to our specialized use here at CERN: we seek help from our local experts and Internet newsgroups", he continues.

Due to the open nature of Linux, hardware drivers can also be developed in-house, if needed. CERN is indeed known as a developer of drivers for high-speed networking cards.

10,000 PCs

Both the NT and the Linux setups have been successful, but new purchases are almost exclusively Linux PCs. Their purchase price is about one tenth of that of supercomputers and one third of that of proprietary Unix solutions. At the moment, efforts are under way to extend the use of Linux to disk and tape servers as well, where the biggest savings can be achieved.

"The higher administration costs increase the total cost of ownership, but Linux PCs are still the most economical solution for us", Renshall says.

The new LHC accelerator (Large Hadron Collider), due to be finished in 2005, will require four petabytes of storage capacity per year and enormous computing power.

The project counts on prices continuing to drop and performance continuing to increase. In the IT staff's plans, the whole main hall of the computing centre will be filled with 5,000 to 10,000 PCs and disk and tape storage systems. The biggest concern is the price of tape storage, which is not getting cheaper as fast as the other components.
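
To put the LHC requirement in perspective, it can be compared with the present tape silos. The sketch below assumes decimal units, today's 50 GB tapes and data arriving evenly over the year; none of these assumptions comes from CERN's plans, and the tapes of 2005 would presumably hold more.

```python
# Decimal units are an assumption: 1 MB = 10**6 bytes, and so on.
MB, GB, PB = 10**6, 10**9, 10**15
SECONDS_PER_YEAR = 365 * 24 * 3600

lhc_bytes_per_year = 4 * PB

# Number of 50 GB tapes needed for one year of LHC data.
tapes_per_year = lhc_bytes_per_year // (50 * GB)

# Slots in the present installation: four towers of 6000 slots.
current_slots = 4 * 6000

# Average data rate if the four petabytes arrived evenly over the year.
average_rate_mb_s = lhc_bytes_per_year / SECONDS_PER_YEAR / MB

print(f"Tapes per year at 50 GB each: {tapes_per_year}")
print(f"Ratio to current slots: {tapes_per_year / current_slots:.1f}")
print(f"Average rate: {average_rate_mb_s:.0f} MB/s")
```

With 1999 tapes, one year of LHC data would fill more than three times the current number of slots, which helps explain why tape prices are the main worry.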


Copyright Arto Teräs 1999.
Redistribution of this entire article is permitted in any medium as long as this copyright notice is preserved.

Last update 3.7.2000