Data Science Seminars

How data is powering more open and collaborative forms of science

11 January 2021 - 12h15-14h00

Online - registration mandatory




The constant improvement of the technologies on which scientific research is based has opened up new opportunities for data sharing. Indeed, science is now able to rely on exchange speeds and volumes unprecedented in its history. Such transformations have contributed to making possible new forms of collaboration. Such intertwining of expertise can occur at different levels of the research data lifecycle, appearing at both the collection and analysis levels. For example, many research groups, like those at CERN, are now analysing huge data sets together across laboratories spread around the world. Other research groups are coordinating data collection for democratically selected studies, a new trend in psychology, for example. Moreover, such collaborations now expand beyond the scientific community to include citizens collecting and analysing data in increasingly participatory and open forms of science.

Through concrete examples drawn from their research, the speakers at this seminar will explore the many ways in which collaboration on data collection and analysis is expanding. These collaborations come with many challenges, notably related to technical aspects, governance issues, and compliance with legal and ethical frameworks. The speakers will therefore focus their presentations on how data sharing is organized and managed in their respective research projects, which pitfalls to avoid and which crucial questions to ask, and what opportunities these new practices bring for a more open science.




Collaborative data practices at the ATLAS experiment of the CERN LHC

Anna Sfyrla, Faculty of Science, Department of Nuclear and Particle Physics.

The ATLAS collaboration at the CERN LHC has approximately 5000 members and about 3000 scientific authors affiliated with 182 institutions in 38 countries. The experiment constitutes one of the largest collaborative efforts ever attempted in science. Close to 1000 physics and performance papers have been published in peer-reviewed journals since 2010. Several of them were related to the Higgs boson discovery, linked to the 2013 Nobel Prize in Physics. How are the data produced, collected, distributed and analysed in such a collaboration? How are researchers granted the right to analyse the data and to be part of the ATLAS author list? What collaborative tools exist, and to what extent is the analysis software shared? This talk will discuss these and other questions related to the data practices of the ATLAS collaboration.


Large-scale collaboration in the field of psychology: Collecting human data around the world

Evie Vergauwe, Faculty of Psychology and Science of Education, Psychology Section.

A variety of new initiatives have been adopted in the field of psychology in the last 5-10 years. One such initiative is the Psychological Science Accelerator (PSA), a globally distributed network of more than 500 psychological science laboratories, representing over 70 countries on all six populated continents. The PSA coordinates data collection for democratically selected studies. Its mission is to accelerate the accumulation of reliable and generalizable evidence in psychological science, a challenge that cannot be adequately met by a single researcher or small team. Furthermore, the PSA explicitly mandates transparent, open, and reproducible research. Coordinating distributed data collection with a more centralized analysis of the data requires the use of appropriate platforms. Moreover, collecting data on human behavior in this type of international collaboration comes with specific ethical and legal challenges. These challenges and possible solutions will be discussed.


Crowd4SDG: How citizen science can impact the gathering and analysis of data for the UN Sustainable Development Goals

François Grey, Centre Universitaire d'Informatique. 

Over the last couple of decades, citizen science – a set of methodologies that enables useful participation of amateurs in real research – has been adopted by an increasingly wide range of scientific disciplines, powered by the ability of ordinary citizens to use digital and mobile technologies to gather, analyse and even compute data. In 2015, the United Nations launched the 17 Sustainable Development Goals (SDGs), a framework for tackling the planet’s greatest environmental and societal challenges. Behind the 17 goals lie 169 targets and 231 indicators, many of which require governments to gather large amounts of quantitative data. Citizen science holds the promise of involving citizens in this process. But to do so requires ensuring that National Statistical Offices are aware of citizen science methodologies and can validate the quality of the resulting data. There is also a need to innovate in the tools and processes used for citizen science, in order to ensure that low-cost solutions are available for monitoring environmental and social indicators, even in resource-poor regions. Finally, there is the overarching challenge of enabling citizens to improve their communities in ways that will contribute to achieving the SDGs, an area where citizen science can also play a vital role. I will illustrate some of the opportunities and discuss some of the potential pitfalls of involving citizens in research for sustainable development, based on a new EC project that the University of Geneva is leading, called Crowd4SDG, in partnership with the UN Institute for Training and Research, CERN and several other partners.