Astro-AI Working Group

New Generation Astronomy Powered by Artificial Intelligence (July 1st 2017 - July 31st 2023)

Objectives

We launched the Astro-AI Working Group (WG) in 2017, and we are continuing it into its second stage from 2021. Our primary goal has been to apply artificial intelligence (AI) techniques, such as deep learning and machine learning, to studies in astrophysics, a field that is about to enter a new era of Big Data. Over the last four years, many relevant papers have been published by our group and by others, and we have identified several kinds of task in which the power of AI can be maximized (e.g., classification, treatment of Big Data, and extraction of latent parameters). Now the astrophysics community as a whole needs to look back at what it could and could not achieve by using AI. In its second phase, our WG primarily aims to hold regular workshops and seminars in order to share previous results in which AI played an important role, discuss which specific themes can have a large impact on astrophysics, and prepare for the coming Big Data era.

1. Application of AI to data reduction of X-ray archival data

Over the last few decades, many X-ray satellites have been launched and have accumulated a large amount of data. For example, Chandra and XMM-Newton have been in operation for over 20 years and have collected imaging and spectroscopic data. The Fermi gamma-ray observatory started operation in 2008 and has provided images, spectra, and timing data for gamma rays coming from the whole sky. More recently, time resolution was greatly improved by NICER, which was installed on the International Space Station (ISS) in 2017. As a result, the data accumulated to date are very large in volume and rich in statistics. We now face a major question: how can we effectively analyze the existing Big Data and discover new science before the next missions begin?

2. Application of AI to future Big Data

In the next decade, several prominent projects will be launched and will collect large amounts of data by making use of new detection technologies. Among them, the SRG mission already started in 2019, and its onboard instrument eROSITA is surveying the whole X-ray sky with a large collecting area (i.e., Big Data in images). XRISM is planned for launch in fiscal year 2022 and will provide energy spectra of unprecedentedly high resolution (i.e., Big Data in energy). NinjaSat, a small satellite mission (CubeSat) led by RIKEN, is designed to monitor bright sources (i.e., Big Data in time). There is concern that analyzing the data from these future missions will be beyond the capacity of human beings. We should therefore construct new analysis methods in advance.

3. Application of AI to experimental and theoretical data

Recent progress in detector technologies is making experimental data larger and more complex. AI is also a powerful tool for extracting essential information from such data while reducing their large volume. Efforts of this kind are already under way for future missions (e.g., image analysis in X-ray polarimetry detectors and event reconstruction in MeV gamma-ray telescopes). We will review recent AI applications to experimental data and discuss ideas for future missions, including new detector concepts. Furthermore, stimulated by recent high-resolution data, theoretical studies also face growing complexity, such as the huge computational cost of high-resolution hydrodynamical simulations. AI is one of the promising ways to accelerate these simulations, and this is also a key topic in this working group.

Facilitators: Naomi Tsuji (RIKEN iTHEMS) *Contact at naomi.tsuji@riken.jp; Hiroki Yoneda (RIKEN Nishina Center); Hiroyoshi Iwasaki (Rikkyo Univ.); Masato Taki (Rikkyo Univ.)

Former Objectives (2017)

Observational astronomy has developed dramatically over the past several decades, achieving a great understanding of the evolution of the Universe since the Big Bang. However, the present way of analyzing astronomical data will no longer be very effective in the coming era of Big Data, not only because of the immense data volume but also because of the accompanying complexity of the data. For example, the Large Synoptic Survey Telescope (LSST) will serve to characterize billions of galaxies based on a 200 Petabyte set of optical images of the sky. Also, the Athena X-ray Observatory, to be launched in 2028, will provide thousands of X-ray spectra of individual supernova remnants with superb spectral resolution. Given the rapid expansion of the breadth and depth of astronomical data, it will become difficult to use large observatories across the entire electromagnetic spectrum to their full capacity. Theoretical astrophysics is facing similar problems: multi-dimensional numerical simulations on supercomputers produce a huge amount of data, making it a challenging task for humans to extract the hidden physics. To solve these problems and to enable efficient data mining, we propose a new direction for astronomical research that incorporates artificial intelligence (AI) techniques, such as deep learning. Below we provide some examples of possible research topics that we can envisage now, though we are open to any subject in astronomy and astrophysics:

1) Observations of supernova remnants are crucial to understanding how stars evolve and explode, and how nuclear fusion takes place during both phases (i.e., stellar evolution and explosion), which in turn provides a key to understanding the chemical evolution of the Universe. Significant inhomogeneity in the physical properties of supernova remnants (e.g., temperature, ionization degree, and elemental abundances) complicates data analysis. We will perform a comprehensive analysis using various machine learning techniques, chosen according to purpose. For instance, spatially resolved spectral analysis of individual objects can be performed efficiently with the so-called sparse Gaussian mixture model. Detailed comparisons between observed and numerically simulated images of supernova remnants help reveal features of the explosion mechanism, such as asymmetry. This should be done with a deep learning method tailored for image recognition.

2) The recent gamma-ray survey by the Fermi Gamma-ray Space Telescope not only provided crucial data for understanding the origin of cosmic rays but also revealed a large number of unidentified sources. We will classify these objects by applying deep learning to all the available data from multi-wavelength observations.

3) In the recent detections of gravitational waves, pattern matching with predictions from general relativity has played an important role. In the case of electromagnetic waves, on the other hand, various environmental conditions such as relativistic gas motion and strong radiation effects have to be considered. Astronomers usually extract patterns from observational data and interpret them using current theories. Up to now, tasks like frequency analysis and temporal convolution have been performed intensively, but their interpretation is often prone to subjectivity. With machine learning, a search trained on theoretical patterns, or an unsupervised pattern search, is a promising way to extract meaningful information that astronomers have overlooked so far. We may obtain new information such as clues to the mysteries of spacetime around black holes and to the shape of neutron star magnetic fields.

4) Galaxy clusters, confined in the deep gravitational potential wells formed by dark matter, are the largest structures in the Universe. We will systematically perform spatial and spectral analysis of clusters at different redshifts using machine learning methods to reveal their detailed evolution history.

5) Supernova explosions and their remnants exhibit an extremely rich diversity owing to their different circumstellar environments, the nature of their progenitor stars, and possibly their explosion mechanisms. Linking progenitor stars with their final explosive events is of utmost importance for understanding the last stages of massive star evolution, but progress has been hindered by the complexity of the problem. Machine learning will help recognize patterns in large sets of simulation results covering a matrix of different initial conditions. Matching them with observational characteristics will reveal the key factors that dictate the observed diversity.