Our second post to mark Love Data Week 2018 is a contribution by Anne-Marie Weijmans and Vivienne Wild from the School of Physics and Astronomy. This is their data journey as astronomers:
As astronomers we face an interesting challenge: the objects that we study (stars, planets, galaxies) are, with only a few exceptions, literally light years away. Paying a visit to any star beyond our Sun, let alone a nearby galaxy, is out of the question. So we have to rely on telescopes to collect the data that we need for our research, to better understand the Universe we live in.
Luckily, astronomers have a tradition of working together in large international collaborations to build the telescopes and collect the data that we need. The galaxies research group in St Andrews has three members (Drs Weijmans, Tojeiro and Wild) who are part of the Fourth Generation of Sloan Digital Sky Surveys (SDSS-IV): an astronomy survey that uses a 2.5-m telescope at Apache Point Observatory in New Mexico, US, to map the sky. In our group we use the data from this survey to learn more about how galaxies form, how they evolve, and what they can tell us about structure formation in the Universe. The data is collected by professional telescope operators at Apache Point and transferred to the SDSS-IV clusters at the Center for High Performance Computing at the University of Utah, where it is reduced by dedicated data reduction pipelines and then stored on a central server.
SDSS-IV data is proprietary for about a year, and is then made publicly available through annual data releases. These public data releases are accompanied by extensive documentation, which makes the data not only available, but also accessible. SDSS-IV actively encourages people to use its data, by offering educational material through its Voyagers website, and by working together with citizen science projects such as Galaxy Zoo. This tradition of having free and easily accessible data was started in 2003 with the very first data release of SDSS-I, while the most recent, 14th data release in 2017 makes more than 325 terabytes (TB) of data available to professional astronomers and the public alike. This tradition of open data has really paid off: more than 7000 papers have been published using SDSS data; more than 30% of the US community of astronomers report using SDSS data; and it has become a reliable source of information for teachers and the general public.
We would not be able to do our research without access to the data that the telescope of the Sloan Digital Sky Survey provides: telescope operations and data handling are complex processes, but with a large collaboration they become manageable. Being able to share the data freely with collaborators after its proprietary period also means opportunities for exciting new projects! We have been working with artist Tim Fitzpatrick in the Shine project, where we use science, art and music to explore the properties of light. Inspired by the SDSS data that we use in our research, Tim has created pieces of art that explore how light is unraveled in spectra, and what information we can extract from the data.
Open Data is essential for modern-day astrophysics, not just for science and educational purposes, but also to allow the data to reach new people who will provide new insights and uses, and perhaps create something beautiful, as Shine has done!
More information on SDSS public data releases can be found in Weijmans et al., 2016, ‘The challenges of a public data release’, https://arxiv.org/abs/1612.05668