Machine learning for particle physics using R

Search strategies for new subatomic particles often depend on being able to efficiently discriminate between signal and background processes. Particle physics experiments are expensive, the competition between rival experiments is intense, and the stakes are high. This has led to increased interest in advanced statistical methods to extend the discovery reach of experiments. This talk will present a walk-through of the development of a prototype machine learning classifier for differentiating between decays of quarks and gluons at experiments like those at the Large Hadron Collider at CERN. The power to discriminate between these two types of particle would have a huge impact on many searches for new physics at CERN and beyond. I will discuss why I chose to perform this analysis in R, how switching to R has helped my work and enabled me to adopt a more efficient reproducible research workflow, and how I have overcome the problems that I encountered when working with large datasets in R.

Andrew John Lowe
Scientific Research Fellow, Wigner Research Centre for Physics

Andrew Lowe is a particle physicist at the Wigner Research Centre for Physics, Hungarian Academy of Sciences, in Budapest. He spent several years based at the European Organization for Nuclear Research (CERN) in Geneva and was a member of the collaboration that discovered the Higgs boson. He played a major role in the development of the core software and algorithms for a real-time multi-stage cascade classifier that filters the collision event data, reducing the rate from 60 TB/s to a manageable 300 MB/s that can be written to permanent storage for subsequent offline analysis. He now works on using machine learning techniques to develop classification algorithms for recognising particles based on their decay properties.