Evaluating a NoSQL alternative for Chilean Virtual Observatory Services

Jonathan Antognini, Mauricio Araya, Mauricio Solar, Camilo Valenzuela, Francisco Lira

Currently, the standards and protocols for data access in the Virtual Observatory architecture (DAL) are generally implemented with relational databases based on SQL. In particular, the Astronomical Data Query Language (ADQL), the language used by IVOA to represent queries to VO services, was created to satisfy the different data access protocols, such as Simple Cone Search. ADQL is based on SQL92, and has extra functionality implemented using PgSphere. An emerging alternative to SQL is the family of so-called NoSQL databases, which can be classified into several categories such as Column, Document, Key-Value, Graph, Object, etc., each one recommended for different scenarios. Their notable characteristics include schema-free design, easy replication support, a simple API, and suitability for Big Data. The Chilean Virtual Observatory (ChiVO) is developing a functional prototype based on the IVOA architecture, with the following relevant factors: Performance, Scalability, Flexibility, Complexity, and Functionality. Currently, it is very difficult to compare these factors, due to a lack of alternatives. The objective of this paper is to compare NoSQL alternatives with SQL through the implementation of a RESTful web API that satisfies ChiVO's needs: a SESAME-style name resolver for the data from ALMA. Therefore, we propose a test scenario by configuring a NoSQL database with data from different sources and evaluating the feasibility of creating a Simple Cone Search service and its performance. This comparison will help pave the way for the application of Big Data databases in the Virtual Observatory.
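To make the Simple Cone Search idea concrete, here is a minimal sketch of a cone search over schema-free records, written in plain Python rather than against any particular NoSQL engine. The record layout (`name`, `ra`, `dec`), the sample catalog, and the function names are assumptions made for illustration; they are not taken from the ChiVO prototype. The three query parameters mirror the RA, DEC and SR (search radius) inputs of the cone search protocol, all in decimal degrees.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = math.sin((dec2 - dec1) / 2.0)
    sin_dra = math.sin((ra2 - ra1) / 2.0)
    a = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    # min() guards against rounding pushing sqrt(a) slightly above 1.
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(a))))


def cone_search(documents, ra, dec, sr):
    """Return the documents within `sr` degrees of (ra, dec), nearest first."""
    hits = []
    for doc in documents:
        sep = angular_separation_deg(ra, dec, doc["ra"], doc["dec"])
        if sep <= sr:
            hits.append((sep, doc))
    hits.sort(key=lambda pair: pair[0])
    return [doc for _, doc in hits]


# Hypothetical, made-up records standing in for a document collection.
CATALOG = [
    {"name": "src-a", "ra": 10.00, "dec": -30.00},
    {"name": "src-b", "ra": 10.05, "dec": -30.02},
    {"name": "src-c", "ra": 200.0, "dec": 45.0},
]

if __name__ == "__main__":
    for doc in cone_search(CATALOG, ra=10.0, dec=-30.0, sr=0.1):
        print(doc["name"])
```

This version scans every record, which is only reasonable for small collections. A real service would presumably rely on a spatial index provided by the database, and that is the kind of performance difference the comparison described above is meant to measure.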

Exorcising the Ghost in the Machine: Synthetic Spectral Data Cubes for Assessing Big Data Algorithms

Mauricio Araya, Mauricio Solar, Diego Mardones, Teodoro Hochfärber

The size and quantity of the data being generated by large astronomical projects like ALMA require a paradigm change in astronomical data analysis. Complex data, such as highly sensitive spectroscopic data in the form of large data cubes, are not only difficult to manage, transfer and visualize, but they also make traditional data analysis techniques and algorithms unfeasible. Consequently, attention has been placed on machine learning and artificial intelligence techniques, to develop approximate and adaptive methods for astronomical data analysis within a reasonable computational time. Unfortunately, these techniques are usually suboptimal, stochastic and strongly dependent on their parameters, which could easily turn into "a ghost in the machine" for astronomers and practitioners. Therefore, a proper assessment of these methods is not only desirable but mandatory for trusting them in large-scale usage. The problem is that positively verifiable results are scarce in astronomy, and moreover, science using bleeding-edge instrumentation naturally lacks reference values. We propose an Astronomical SYnthetic Data Observatory (ASYDO), a virtual service that generates synthetic spectroscopic data in the form of data cubes. The objective of the tool is not to produce accurate astrophysical simulations, but to generate a large amount of labelled synthetic data, to assess advanced computing algorithms for astronomy and to develop novel Big Data algorithms. The synthetic data is generated using a set of spectral lines, template functions for spatial and spectral distributions, and simple models that produce reasonable synthetic observations. Emission lines are obtained automatically using IVOA's SLAP protocol (or from a relational database) and their spectral profiles correspond to distributions in the exponential family. The spatial distributions correspond to simple functions (e.g., 2D Gaussian), or to scalable template objects. The intensity, broadening and radial velocity of each line are given by very simple and naive physical models, yet ASYDO's generic implementation supports new user-made models, which potentially allows adding more realistic simulations. The resulting data cube is saved as a FITS file, which also includes all the tables and images used for generating the cube. We expect to implement ASYDO as a virtual observatory service in the near future.
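The generation scheme described above can be illustrated with a small sketch. This is not ASYDO's code: it only assumes that each source is the product of a 2D Gaussian spatial distribution and a Gaussian spectral profile (one member of the exponential family), with peak intensity, line centre and broadening given directly as parameters rather than derived from a physical model. The function names and the dictionary keys are hypothetical, and writing the result to FITS is left out.

```python
import math


def gaussian(x, center, sigma):
    """Unnormalised Gaussian profile with value 1.0 at its centre."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)


def synthetic_cube(n_chan, n_y, n_x, sources):
    """Build cube[channel][y][x] by summing separable Gaussian sources.

    Each source is a dict with keys peak, chan, chan_sigma, y, x, sigma.
    The list of sources doubles as the ground-truth labels of the cube.
    """
    cube = [[[0.0] * n_x for _ in range(n_y)] for _ in range(n_chan)]
    for src in sources:
        # Spectral profile: one emission line centred on channel `chan`.
        spectrum = [
            src["peak"] * gaussian(c, src["chan"], src["chan_sigma"])
            for c in range(n_chan)
        ]
        # Spatial distribution: circular 2D Gaussian centred on (y, x).
        image = [
            [
                gaussian(y, src["y"], src["sigma"]) * gaussian(x, src["x"], src["sigma"])
                for x in range(n_x)
            ]
            for y in range(n_y)
        ]
        for c in range(n_chan):
            for y in range(n_y):
                for x in range(n_x):
                    cube[c][y][x] += spectrum[c] * image[y][x]
    return cube


if __name__ == "__main__":
    # Made-up source parameters, for illustration only.
    labels = [
        {"peak": 2.0, "chan": 8, "chan_sigma": 1.5, "y": 4, "x": 5, "sigma": 1.0},
    ]
    cube = synthetic_cube(16, 8, 10, labels)
    print(cube[8][4][5])  # brightest voxel, at the source position
```

Because the cube is built from a known list of sources, any detection or classification algorithm run on it can be scored against those labels, which is the assessment role the abstract assigns to synthetic data.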

  • Chilean Virtual Observatory services implementation for the ALMA public data

Jonathan Antognini, Mauricio Solar, Jorge Ibsen, Mauricio Araya, Lars Nyman, Diego Mardones, Camilo Valenzuela, Patricio Ramirez, Christopher Fernandez, Mario Garces

"The success of an observatory is usually measured by its impact on the scientific community, so a common objective is to provide transparent ways to access the generated data. The Chilean Virtual Observatory (ChiVO) started working on the implementation of a prototype, in collaboration with ALMA, considering the current needs of the Chilean astronomical community, in addition to the protocols and standards of IVOA, and the comparison of different existing data access toolkit services. Based on these efforts, a VO prototype was designed and implemented for ALMA's large-scale data."

  • Chilean Virtual Observatory and Integration with ALMA

Mauricio Solar, Walter Fariña, Diego Mardones, Jonathan Antognini, Karim Pichara, Neil Nagar, Victor Parada, Jorge Ibsen, Lars Nyman, José Marroquin

"The Virtual Observatories strive to interoperate, exchange data and share services as if they were a single large VO. In this work, the state of the art of VOs will be presented and summarized in a schematic diagram with the frequency range of the observed data that every VO publishes. Chile, currently a member of the IVOA, collaborates with the Atacama Large Millimeter/submillimeter Array (ALMA) to study and propose ways to adapt the data generated by ALMA to the different data models proposed by the IVOA."

  • Automatic detection and automatic classification of structures in astronomical images

Rodrigo Gregorio, Mauricio Solar, Diego Mardones, Karim Pichara, Ricardo Contreras, Victor Parada

"The study of astronomical structures is important to the astronomical community because it can help to identify objects, which can be classified based on their internal structure or their relation to other objects. For this reason, an automated tool is being developed to decompose astronomical images into their components. First, a 2D image is decomposed into different spatial scales based on the wavelet transform. Then, a detection algorithm, such as Clumpfind, Gaussclump, or a Dendrogram technique, is applied to each spatial scale. The goal is to build a new algorithm and tool that is available to the community and satisfies the requirements of the next Chilean Virtual Observatory (ChiVO)."
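The two-step pipeline in this abstract (multi-scale decomposition, then detection per scale) can be sketched as follows. The abstract does not say which wavelet transform is used, so this example assumes the "à trous" (starlet) transform with a B3-spline kernel, a common choice for astronomical images. The detection step is a plain threshold, swapped in as a stand-in for Clumpfind, Gaussclump or dendrograms, which are considerably more involved. All function names are hypothetical.

```python
# B3-spline smoothing kernel used by the a trous (starlet) transform.
KERNEL = (1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16)


def _clamp(i, n):
    """Clamp index i to the valid range [0, n - 1] (edge replication)."""
    return max(0, min(n - 1, i))


def smooth(image, step):
    """Separable smoothing with kernel taps spaced `step` pixels apart."""
    n_y, n_x = len(image), len(image[0])
    # Pass 1: smooth along x.
    rows = [
        [
            sum(k * image[y][_clamp(x + (t - 2) * step, n_x)] for t, k in enumerate(KERNEL))
            for x in range(n_x)
        ]
        for y in range(n_y)
    ]
    # Pass 2: smooth the result along y.
    return [
        [
            sum(k * rows[_clamp(y + (t - 2) * step, n_y)][x] for t, k in enumerate(KERNEL))
            for x in range(n_x)
        ]
        for y in range(n_y)
    ]


def a_trous(image, n_scales):
    """Return (wavelet planes, smooth residual); their sum restores the image."""
    planes = []
    current = image
    for j in range(n_scales):
        smoothed = smooth(current, 2 ** j)
        # Wavelet plane j holds the detail lost between two smoothing levels.
        planes.append(
            [[a - b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(current, smoothed)]
        )
        current = smoothed
    return planes, current


def detect(plane, threshold):
    """Stand-in detector: (y, x) of pixels whose coefficient exceeds threshold."""
    return [(y, x) for y, row in enumerate(plane) for x, v in enumerate(row) if v > threshold]


if __name__ == "__main__":
    # Toy image: a single bright pixel on an empty 9x9 background.
    image = [[0.0] * 9 for _ in range(9)]
    image[4][4] = 1.0
    planes, residual = a_trous(image, n_scales=2)
    for j, plane in enumerate(planes):
        print(j, detect(plane, threshold=0.5))
```

Compact structures show up in the first planes and extended ones in the later planes, so running a detector on each plane separately is one way to separate objects by spatial scale.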