Friday, December 14, 2012

Free software at CERN, or: how FOSS helped the discovery of the Higgs boson

In which fields is FOSS really important, and where has it had big successes in recent years? Research is probably one of the most important. And when you talk about scientific research, you must talk about CERN.
In the January issue of GNU/Linux Magazine Italia you will find an interview I did with two researchers on this topic: Sebastien Ponce is the head of CASTOR, CERN's data storage system (based on GNU/Linux), and Brian Bockelman is an American physicist who works with CERN software to analyse data from the ATLAS and CMS experiments (the ones that found the Higgs particle).
I think you will find their answers really interesting.

If you read GNU/Linux Magazine Italia, you will find in the January 2013 issue (on newsstands as early as next week) the Italian translation of the whole interview, with a short introduction.

I would like to thank both Brian and Sebastien for the time they spent answering me, and also Melissa Gaillard from the CERN press office for all the work she did.

Among other things, at the end of this page, there is an off-topic image.

Here is the interview:

-First of all, please tell us something about yourselves...
Sebastien Ponce: I am a computer scientist, specialized in the development of large software frameworks for data analysis and in mass storage. I am currently the head of the development of CERN's mass storage system (CASTOR), which stores and manages most of the data produced at CERN (85 PB right now, growing at a rate of 1 PB/week). I started using free open source software at school in the 90s with Debian 2.0 and have mainly used FOSS since then (still on Debian). All the software I've written in the last 15 years is free and open source, be it for CERN or outside.
Brian Bockelman: I work for the University of Nebraska-Lincoln as a postdoc and work for the Open Science Grid (OSG) and the Compact Muon Solenoid (CMS), one of the experiments on the Large Hadron Collider (LHC) at CERN. At Nebraska, we have one of the Tier-2 computing sites dedicated to CMS; this got me started with large-scale storage systems as a graduate student. I've been using Linux since the late 90s - I've played with several different distributions, but almost always end up on either Fedora or RedHat.

-We know that at CERN you use GNU/Linux systems. Do you use a common distribution (Debian, Fedora,...) or have you developed your own distro?
Sebastien Ponce: CERN is indeed using mainly GNU/Linux systems for data storage and analysis. A dedicated distribution has been developed together with FermiLab and various other labs and universities around the world. It is called Scientific Linux, or SL, and its primary purpose is to reduce duplicated effort among the labs participating in CERN-related projects, and to have a common install base for the various experimenters. The base SL distribution is Enterprise Linux, recompiled from source.
Brian Bockelman: At Nebraska, we also use Scientific Linux 6.3. The CMS experiment has about 50 computing sites throughout the world and all use some variant of RedHat Enterprise Linux - be it RHEL licenses, CentOS, ROCKS, or Scientific Linux.

-Why did you choose Free Open Source Software? Do you also use some proprietary tools? 
Sebastien Ponce: Open access in general has a strong tradition at CERN. In 1953, the Convention for the establishment of CERN already stated: "the results of its experimental and theoretical work shall be published or otherwise made generally available". This still holds today in all fields where CERN is involved, including computer science. For example, it's worth mentioning two initiatives outside software in this domain: the open hardware initiative with its Open hardware license, and the Sponsoring Consortium for Open Access Publishing in Particle Physics. So when it comes to software, CERN naturally releases its code under free open source licenses and makes heavy use of free open source software. Having said that, we also use non-free open source software, and proprietary tools in specific domains like civil engineering, databases or tape storage.

-How does the distribution of data work? (Or: Can you briefly explain Castor and the "tiers" mechanism?)
Brian Bockelman: We primarily have two types of data: simulated event data, and data read from the detector. The detector data starts off underground at CERN, where it is transferred to the CERN data center. A first quick transformation of the data (called "reconstruction") is done there, written to tape in Castor, and transferred to multiple "Tier-1" sites around the world. At the Tier-1 sites, a second copy is written to tape and another processing pass is done. The data, now in a format appropriate for physicists to use in their work, is copied to one of forty CMS Tier-2 sites like Nebraska. Compared to the Tier-1s, which are fully utilized by centrally-planned processing, the Tier-2s are somewhat chaotic. Each user will be analyzing a different set of data with a different application; the analysis may have to work at ten different sites to complete. The distributed nature of our work has led to quite a bit of middleware development - we have to distribute petabytes of data to multiple sites and make it so the user has transparent access to all of them.

-What is Apache Hadoop, and why do you use it? 
Brian Bockelman: CERN writes its data in Castor and the Tier-1 sites write data into various niche systems which integrate directly with tape. However, at the Tier-2s, we have more flexibility in selecting the storage system. A storage system needs to be able to stitch together multiple disk servers (there's about 2.5 petabytes of raw disk at Nebraska), provide a uniform namespace, be quite reliable, and have sufficiently high performance. We need to be able to layer grid components for cross-site data access. Finally, this all needs to be easy to administer: the whole of our Tier-2 has two sysadmins, and we cannot afford to dedicate one solely to filesystem issues. In late 2008, I began examining the Hadoop Distributed File System (HDFS) from the Apache Software Foundation. HDFS is implemented in Java and shares many design characteristics with the Google File System. The services run in userspace and layer on top of the disk server's filesystem. There are two basic node types, the namenode and the datanode (there is a third, the secondary namenode, which can be ignored for now). The namenode manages the namespace and orchestrates access to data for user processes; the namespace metadata is kept in memory, allowing for incredibly fast read access. It also breaks the files into fixed-size blocks (we use 128MB) and selects the datanode to host each block. It keeps track of each block's location and makes sure there is a sufficient number of replicas (we require two replicas for each block). Because of the replication, HDFS is a breeze to manage. The death of an entire datanode is not a critical event: the namenode will simply create another replica of all the data it held. In fact, we don't even bother returning dead hard drives until we have at least a box full of dead ones. The only host we have to keep a close eye on is the namenode; it's a lot easier to have only one critical system compared to dozens.
While other systems may have similar or better performance characteristics, the reliability and ease of management are really why we selected HDFS in the end. It's also quite comforting to know that we won't ever be the largest user of HDFS; it's nice to have someone else work out scale issues. We converted the site to HDFS in early 2009; since our initial success, six other Tier-2 sites in the US have also converted. We have been happily using HDFS since then, but I try to keep an eye on the alternatives. The systems which are closest to HDFS are Ceph and Gluster, but we're not planning a switch anytime soon.
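
To make the block and replica bookkeeping described in this answer a bit more concrete, here is a minimal toy model in Python. It is only a sketch: `ToyNamenode`, its methods and the "least-loaded node" placement rule are assumptions made up for illustration, not the real HDFS API or its actual placement policy. Only the 128 MB block size and the two replicas per block come from the interview.

```python
from collections import defaultdict

BLOCK_SIZE = 128 * 1024**2   # 128 MB blocks, as quoted in the interview
REPLICAS = 2                 # two replicas per block, as quoted in the interview


class ToyNamenode:
    """Hypothetical in-memory model of a namenode; NOT the real HDFS API."""

    def __init__(self, datanodes):
        self.datanodes = set(datanodes)
        # (path, block index) -> set of datanodes holding a replica
        self.block_locations = {}

    def _pick_nodes(self, count, exclude=()):
        # Assumed placement rule: prefer the live datanodes holding the
        # fewest replicas, skipping nodes that already hold this block.
        load = defaultdict(int)
        for holders in self.block_locations.values():
            for node in holders:
                load[node] += 1
        candidates = sorted(self.datanodes - set(exclude),
                            key=lambda n: (load[n], n))
        if len(candidates) < count:
            raise RuntimeError("not enough live datanodes")
        return set(candidates[:count])

    def add_file(self, path, size_bytes):
        # Split the file into fixed-size blocks (ceiling division).
        n_blocks = max(1, -(-size_bytes // BLOCK_SIZE))
        for index in range(n_blocks):
            self.block_locations[(path, index)] = self._pick_nodes(REPLICAS)
        return n_blocks

    def datanode_died(self, node):
        # Losing a datanode is not critical: re-create the lost replicas
        # on the remaining nodes and report how many blocks were repaired.
        self.datanodes.discard(node)
        repaired = 0
        for holders in self.block_locations.values():
            if node in holders:
                holders.discard(node)
                missing = REPLICAS - len(holders)
                holders |= self._pick_nodes(missing, exclude=holders)
                repaired += 1
        return repaired


namenode = ToyNamenode(["dn1", "dn2", "dn3"])
n_blocks = namenode.add_file("/store/example.root", 300 * 1024**2)
repaired = namenode.datanode_died("dn1")
```

In this toy run a 300 MB file is split into three blocks, and after `dn1` disappears every block is back to two replicas on the surviving nodes. As a rough side calculation, and ignoring any overhead, keeping two replicas of everything means that 2.5 PB of raw disk holds about 1.25 PB of unique data.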

-In particular, how has the analysis of ATLAS and CMS data been done?
Brian Bockelman: Readers familiar with Hadoop will notice that I only described the filesystem, and skipped the Map/Reduce component of the distribution. This is because we actually don't use Map/Reduce: the resources CMS uses overlap with those used by other LHC experiments on the grid. The entire grid is about 140 computing sites; getting everyone to agree on a single technology implementation is nearly impossible. Instead, we make sure we have interoperability between sites' technology using platforms from organizations like the Open Science Grid or the European Middleware Initiative. Our analyses are based on breaking large tasks into many batch system jobs which are distributed about the planet; they read data directly from the sites' storage system. Conceptually, it's close to Map/Reduce: we are often either transforming data to different formats or filtering out data not relevant to the individual's work. This is done in several passes, each one having smaller, more-specialized data than the last. Finally, we end up with something which the physicist can move off the grid and onto his laptop for some final work using techniques like neural networks and multi-variate analysis.
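
The "several passes, each one smaller than the last" idea from this answer can be sketched in a few lines of Python. This is only an illustration under assumptions: the event fields (`n_muons`, `energy`, `raw`), the selection cuts and the helper names are invented here, and a plain loop stands in for the batch system that would actually run each chunk as a separate job at some grid site.

```python
def split_into_jobs(events, events_per_job):
    """Cut one large task into independent chunks, one per batch job."""
    return [events[i:i + events_per_job]
            for i in range(0, len(events), events_per_job)]


def run_pass(events, keep, transform, events_per_job=1000):
    """One pass: each job filters and reformats its own chunk independently."""
    output = []
    for chunk in split_into_jobs(events, events_per_job):
        # In reality each chunk would be a separate job, possibly at a
        # different site; here the jobs simply run one after another.
        output.extend(transform(event) for event in chunk if keep(event))
    return output


# Hypothetical input: 10000 fake events with made-up fields.
events = [
    {"id": i, "n_muons": i % 4, "energy": float(i % 100), "raw": "x" * 10}
    for i in range(10000)
]

# Pass 1: keep events with at least two muons and drop the bulky raw payload.
pass1 = run_pass(
    events,
    keep=lambda e: e["n_muons"] >= 2,
    transform=lambda e: {"id": e["id"], "energy": e["energy"]},
)

# Pass 2: a tighter, more specialized selection on the already slimmed data.
pass2 = run_pass(
    pass1,
    keep=lambda e: e["energy"] > 90.0,
    transform=lambda e: e["id"],
)
```

Each pass both filters (the `keep` function) and changes the format (the `transform` function), so its output is smaller and more specialized than its input, until the result is small enough to move onto a laptop.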

-Is there something you think is still missing in FOSS for research (physics research)?
Brian Bockelman: As mentioned by Sebastien, the field of High Energy Physics (HEP) has a long history of collaboration, leading to its natural fit with FOSS. Accordingly, basically all the tools used in research are open source. It adds up to a huge amount of software - the core physics software of CMS alone amounts to about 6 million lines of code. Nothing strikes me as "missing", but there are always things we could do better. I work closely with sysadmins running university clusters, so I always hear their pain:
  - Package management on RHEL still has rough spots. It's nearly impossible to roll back from a significant upgrade; in the end, we typically just have to scratch the host and let Cobbler/Puppet rebuild it. We also have poor tools for userspace installs: a user often has to resort to compiling tens of software packages from source if they want to install something in their home directory without sysadmin help, yet they may not want the responsibilities of running their own VM (and maintaining a safe, fully-patched configuration). The amount of work necessary for a non-admin user to install their favorite python module for personal use is striking when compared to how easy it is for the sysadmin.
  - We tend to configure software a single piece at a time, yet complex services may involve a dozen pieces of software and scattered system configuration. There don't seem to be many "best practices" available for configuring many pieces of software that don't involve editing dozens of configuration files.
The progress made since I started is indeed striking; for example, I'm incredibly excited to roll out systemd at work. I've had enough of poorly-written init scripts for my lifetime.

-What do you think about the future of computer aided analysis, and what role can FOSS have in it?
Sebastien Ponce: Computer aided analysis has established itself as the main tool for scientific researchers and I believe it will stay so for a long time. FOSS has an essential role to play there, as it is the most efficient approach to software for this community, which is spread out, heterogeneous and made up of users with a high level of education. As CERN has proven in the last 58 years, collaborating is the key to success in such a context, and FOSS is making it happen in the computer science field. So I see FOSS as a key player in computer aided analysis in the future (and already now), one that may structure all the software in the domain.

-Can you give us your personal opinion about the use of FOSS in universities and research institutes?
Brian Bockelman: I personally think that the use of open source software (and free open source software in particular) in universities and research institutes is fundamental, and I see two main reasons for this.
The first is the nature of research institutes and universities: they usually have a small budget, but very skilled and talented people. This usually makes the payment of licenses associated with non-free software a heavy constraint, while the skills needed to support and maintain open source software are often available.
So free open source software matches well here, next to non-free open source software when support and/or maintenance are outsourced. The second reason is that open source software in general (free or not) allows university and research communities to build extremely large and heterogeneous systems (like the LHC computing grid). It would be extremely difficult to agree on a unique set of proprietary tools for such systems across hundreds of laboratories with different constraints and strategies. With open source software, one can more easily adapt existing tools so that different solutions cooperate. This was a key ingredient in the success of the grid.

-Is there something you would like to add for our readers? 
Sebastien Ponce: If you are interested in knowing more about CERN and our research there (including in computer science), have a look at our web site or come and visit us in Geneva. It's free and our computer center can be part of the tour. You may also be interested in CERN's job offers; they are all available on our web site.

OK, that is basically the interview. Before concluding this post, even if it's off topic, I would like to thank Jonathan Riddell for sending me a Kubuntu polo shirt. I received it just a few days ago and wanted to wear it immediately, even though it had snowed in those days and it was quite cold (about 4°C). Worth it:


  1. I miss a statement about the ROOT framework. IMHO it is one of the worst pieces of software that ever came out of CERN. It influenced generations of physics students to write bad "C++" code and is generally just poor software.

    I have decided for myself that I will never touch anything from CERN anymore, as long as they don't push an official statement out, that their ROOT is the worst software ever.

  2. It's nice to see this being mentioned on planetkde. The praise should not only be for CERN, though. Many institutions and projects around the world have contributed to this success. Otherwise, it wouldn't have been possible. Look at all the FP6 and FP7 projects, for example.