PZ Compute - Input Data QA Notebook - DES DR2 Sample of Galaxies¶
Collection of magnitude measurements for galaxies made available by DES survey, release DES DR2, obtained from the DB LIneA database.
Contact: Luigi Silva (luigi.lcsilva@linea.org.br)
Acknowledgments¶
If you use this dataset to generate scientific results, please acknowledge LIneA in the acknowledgments section of your publication. For instance:
'This research used computational resources from the Associação Laboratório Interinstitucional de e-Astronomia (LIneA) with the financial support of INCT do e-Universo (Process no. 465376/2014-2).'
Notes about the data¶
This notebook characterizes the magnitude measurements in the DES DR2 catalog, which has been publicly distributed and described in detail in the scientific literature by the DES project. See the details by clicking on the links in the table below.
seq. | Survey name (link to the website) | Number of objects in the original sample | Reference (link to the paper) |
---|---|---|---|
1 | DES DR2 | ~691 million distinct astronomical objects | DES Collaboration 2021 |
In this notebook, we will use the data from the coadd_objects table of the DES DR2 catalog. The table originally has 215 columns (the name and meaning of each column can be found here).
We will use the following columns here:
Column | Meaning |
---|---|
COADD_OBJECT_ID | Unique identifier for the coadded objects |
RA | Right ascension, with quantized precision for indexing (ALPHAWIN_J2000 has full precision but not indexed) [degrees] |
DEC | Declination, with quantized precision for indexing (DELTAWIN_J2000 has full precision but not indexed) [degrees] |
MAG_AUTO_{G,R,I,Z,Y}_DERED | Dereddened magnitude estimation (using SFD98), for an elliptical model based on the Kron radius [mag] |
MAGERR_AUTO_{G,R,I,Z,Y} | Uncertainty in magnitude estimation, for an elliptical model based on the Kron radius [mag] |
FLAGS_{G,R,I,Z,Y} | Additive flag describing cautionary advice about source extraction process. Use less than 4 for well behaved objects |
EXTENDED_CLASS_COADD | 0: high confidence stars; 1: candidate stars; 2: mostly galaxies; 3: high confidence galaxies; -9: No data; Using Sextractor photometry |
The SExtractor flags (FLAGS_{G,R,I,Z,Y}) are related to basic warnings about the source extraction process (see more here). If there is more than one warning, the flag values are added together. The flags are shown below, in order of increasing concern.
Value | Meaning |
---|---|
1 | aperture photometry is likely to be biased by neighboring sources or by more than 10% of bad pixels in any aperture |
2 | the object has been deblended |
4 | at least one object pixel is saturated |
8 | the isophotal footprint of the detected object is truncated (too close to an image boundary) |
16 | at least one photometric aperture is incomplete or corrupted (hitting buffer or memory limits) |
32 | the isophotal footprint is incomplete or corrupted (hitting buffer or memory limits) |
64 | a memory overflow occurred during deblending |
128 | a memory overflow occurred during extraction |
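Because the flag values are powers of two that are added together, a combined value can be split back into its individual warnings with bitwise tests. The snippet below is a minimal sketch of that idea; the names `SEXTRACTOR_FLAGS`, `decode_flags` and `is_well_behaved` are hypothetical helpers, not part of the notebook, and the meanings are shortened versions of the table rows above.

```python
# Bit values and (shortened) meanings copied from the table above.
SEXTRACTOR_FLAGS = {
    1: "aperture photometry likely biased by neighbors or bad pixels",
    2: "object has been deblended",
    4: "at least one object pixel is saturated",
    8: "isophotal footprint truncated (too close to an image boundary)",
    16: "photometric aperture incomplete or corrupted",
    32: "isophotal footprint incomplete or corrupted",
    64: "memory overflow during deblending",
    128: "memory overflow during extraction",
}


def decode_flags(value):
    """Return the list of warnings encoded in an additive flag value."""
    return [meaning for bit, meaning in SEXTRACTOR_FLAGS.items() if value & bit]


def is_well_behaved(value):
    """Apply the 'less than 4' recommendation from the column description."""
    return value < 4


# 19 = 16 + 2 + 1, the largest flag value in the statistics table of this notebook.
for warning in decode_flags(19):
    print(warning)
```

A value below 4 can only combine the first two warnings (biased aperture photometry and deblending), which is why it is used as the threshold for well behaved objects.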
Part 1 - Characterization of a random sample of data¶
We will apply the following filters in the subsequent selection of objects:
- EXTENDED_CLASS_COADD $\geq$ 2, that is, mostly galaxies and high confidence galaxies;
- MAG_AUTO_I_DERED $\leq$ 24.
Furthermore, we will select the data at random, using TABLESAMPLE SYSTEM in the query with a sampling percentage of 0.03. Because the TABLESAMPLE argument is a percentage, this samples about 0.03% of the table rows (not 3%); the filters above are then applied to the sampled rows.
Check below a brief characterization of the data.
General settings for this notebook¶
Requirements for this notebook:
- General libraries: os, sys, numpy;
- Data access and manipulation libraries: pandas, dblinea;
- Astronomy: astropy;
- View libraries: bokeh, holoviews, geoviews, cartopy;
- Auxiliary file: des-round19-poly.txt (contours of the area covered by the survey, i.e., DES footprint, 2019 version).
Download the file des-round19-poly.txt from the repository kadrlica/skymap on GitHub:
--2024-10-16 17:02:46-- https://raw.githubusercontent.com/kadrlica/skymap/master/skymap/data/des-round19-poly.txt
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.111.133, 185.199.110.133, 185.199.109.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.111.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 9947 (9.7K) [text/plain]
Saving to: ‘des-round19-poly.txt’
des-round19-poly.tx 100%[===================>] 9.71K --.-KB/s in 0s
2024-10-16 17:02:46 (31.0 MB/s) - ‘des-round19-poly.txt’ saved [9947/9947]
Imports¶
Importing some scientific and visualization libraries.
Printing the versions of Python, Numpy, Bokeh and Holoviews:
Python version: 3.12.3 | packaged by conda-forge | (main, Apr 15 2024, 18:38:13) [GCC 12.3.0]
Numpy version: 1.26.4
Bokeh version: 3.4.2
HoloViews version: 1.19.1
Configs¶
Setting the number of rows that pandas will show:
Setting holoviews to work with bokeh:
Setting geoviews to work with bokeh:
Set bokeh plots to be inline:
Set matplotlib plots to be inline:
Reading and filtering the data¶
Read DES footprint file des-round19-poly.txt¶
Here, we will read the DES DR2 footprint from the des-round19-poly.txt file and print the minimum and maximum of R.A. and DEC.
R.A. AND DEC COORDINATES, BEFORE USING SKYCOORD
R.A. min: -60.00 | R.A. max: 99.00
DEC min: -66.90 | DEC max: 5.00
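A minimal sketch of this reading step is shown below. It assumes the footprint file is plain text with two whitespace-separated columns (R.A. and DEC, in degrees) and `#` comment lines; that layout, the variable names and the four vertices are assumptions for illustration. An in-memory stand-in replaces the real file so the example runs anywhere, and its extreme values were chosen to match the minimum and maximum printed above.

```python
import io

import numpy as np

# Hypothetical stand-in for des-round19-poly.txt (assumed layout: "ra dec" per line).
# The vertices are made up; only their extremes mimic the printed output.
fake_footprint = io.StringIO(
    "# ra dec\n"
    "23.0 -7.0\n"
    "99.0 -40.0\n"
    "-60.0 -66.9\n"
    "0.0 5.0\n"
)

# unpack=True returns one array per column; '#' lines are skipped by default.
foot_ra, foot_dec = np.loadtxt(fake_footprint, unpack=True)

print(f"R.A. min: {foot_ra.min():.2f} | R.A. max: {foot_ra.max():.2f}")
print(f"DEC min: {foot_dec.min():.2f} | DEC max: {foot_dec.max():.2f}")
```

With the real file, the `io.StringIO` object would simply be replaced by the file path.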
After reading the footprint, we create a SkyCoord object from the Astropy library using the footprint R.A. and DEC coordinates. SkyCoord provides a flexible interface for representing, manipulating and transforming celestial coordinates between systems. We also use Astropy's units module; u.degree, for example, indicates that the coordinates are in degrees. Additionally, we use the wrap_at method to ensure that the R.A. coordinates are in the range $[-180,180)$.
R.A. AND DEC COORDINATES, AFTER USING SKYCOORD
R.A. min: -60.00 | R.A. max: 99.00
DEC min: -66.90 | DEC max: 5.00
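The printed limits are unchanged because the footprint R.A. values already lie inside $[-180,180)$. To make the effect of the wrapping visible, the sketch below reproduces the arithmetic with plain NumPy on a few made-up angles. It is an equivalent calculation for illustration, not Astropy's implementation, and `wrap_to_180` is a hypothetical helper name.

```python
import numpy as np


def wrap_to_180(angle_deg):
    """Map angles in degrees onto the half-open interval [-180, 180)."""
    return (np.asarray(angle_deg, dtype=float) + 180.0) % 360.0 - 180.0


# Made-up R.A. values, some of them outside [-180, 180).
ra = np.array([0.0, 99.0, 180.0, 300.0, 359.5, -60.0])
print(wrap_to_180(ra))
```

Values such as 300 degrees map to -60 degrees, while values already inside the interval (for example 99 or -60) are left untouched.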
Getting the DES DR2 data via dblinea¶
Instantiating the DBBase class, which handles the connection to the database.
Defining what data we want to access. Here, we will access the data from the coadd_objects table of the DES DR2 catalog.
Defining the parameters for the random function in the query (TABLESAMPLE SYSTEM).
Defining the filters as described in the introduction of this notebook (EXTENDED_CLASS_COADD $\geq$ 2 and MAG_AUTO_I_DERED $\leq$ 24).
Defining the query.
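As an illustration of how the pieces above fit together, the sketch below assembles a query string from the column list, the sampling parameter and the two filters. The schema-qualified table name `des_dr2.coadd_objects` and all variable names are assumptions, not taken from the notebook's code; in PostgreSQL-style SQL the TABLESAMPLE SYSTEM argument is a percentage between 0 and 100.

```python
# Hypothetical names: the table identifier and the variables below are placeholders.
table = "des_dr2.coadd_objects"
sample_percent = 0.03  # percentage of the table sampled by TABLESAMPLE SYSTEM

bands = ["g", "r", "i", "z", "y"]
columns = (
    ["coadd_object_id", "ra", "dec"]
    + [f"mag_auto_{b}_dered" for b in bands]
    + [f"magerr_auto_{b}" for b in bands]
    + [f"flags_{b}" for b in bands]
    + ["extended_class_coadd"]
)

# Filters from the introduction: galaxies (class >= 2) brighter than i = 24.
query = (
    f"SELECT {', '.join(columns)} "
    f"FROM {table} TABLESAMPLE SYSTEM ({sample_percent}) "
    "WHERE extended_class_coadd >= 2 AND mag_auto_i_dered <= 24"
)
print(query)
```

The 19 selected columns plus the four colors computed later account for the 23 columns reported in the statistics table.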
Running the query and measuring the elapsed time with the Jupyter magic command time. It may take a while (about 3 minutes).
CPU times: user 1.61 s, sys: 373 ms, total: 1.98 s Wall time: 3.4 s
Computing and saving the colors (g-r), (r-i), (i-z) and (z-y).
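A minimal sketch of the color computation with pandas follows. The two rows of magnitudes are made up for illustration; the column names follow the ones shown in the statistics table of this notebook.

```python
import pandas as pd

# Tiny made-up sample with the dereddened magnitude columns used in this notebook.
df = pd.DataFrame(
    {
        "mag_auto_g_dered": [24.1, 22.0],
        "mag_auto_r_dered": [23.4, 21.2],
        "mag_auto_i_dered": [23.0, 20.9],
        "mag_auto_z_dered": [22.7, 20.7],
        "mag_auto_y_dered": [22.6, 20.6],
    }
)

bands = ["g", "r", "i", "z", "y"]

# Each color is the difference between adjacent bands: (g-r), (r-i), (i-z), (z-y).
for blue, red in zip(bands[:-1], bands[1:]):
    df[f"mag_auto_({blue}-{red})_dered"] = (
        df[f"mag_auto_{blue}_dered"] - df[f"mag_auto_{red}_dered"]
    )

print(df.filter(like="-").round(3))
```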
Basic statistics¶
Below, we have basic data statistics for each column of the table.
statistic | coadd_object_id | ra | dec | mag_auto_g_dered | mag_auto_r_dered | mag_auto_i_dered | mag_auto_z_dered | mag_auto_y_dered | magerr_auto_g | magerr_auto_r | ... | flags_g | flags_r | flags_i | flags_z | flags_y | extended_class_coadd | mag_auto_(g-r)_dered | mag_auto_(r-i)_dered | mag_auto_(i-z)_dered | mag_auto_(z-y)_dered |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
count | 9.998800e+04 | 99988.000000 | 99988.000000 | 99988.000000 | 99988.000000 | 99988.000000 | 99988.000000 | 99988.000000 | 99988.000000 | 99988.000000 | ... | 99988.000000 | 99988.000000 | 99988.000000 | 99988.000000 | 99988.000000 | 99988.000000 | 99988.000000 | 99988.000000 | 99988.000000 | 99988.000000 |
mean | 1.251899e+09 | 102.527895 | -33.877490 | 25.287078 | 23.256352 | 22.627857 | 22.488806 | 30.107333 | 2.132968 | 0.280633 | ... | 0.733088 | 0.731068 | 0.729248 | 0.730488 | 0.751410 | 2.747720 | 2.030726 | 0.628496 | 0.139051 | -7.618526 |
std | 2.151841e+08 | 119.488098 | 19.588251 | 10.150428 | 3.161639 | 1.276152 | 3.110938 | 23.318106 | 13.772016 | 3.822448 | ... | 1.226948 | 1.211881 | 1.203133 | 1.209414 | 1.326831 | 0.434323 | 9.873097 | 2.870491 | 2.820223 | 22.981943 |
min | 8.702647e+08 | 0.000243 | -67.432467 | 10.817312 | 10.896320 | 10.564870 | 9.920390 | 8.397012 | 0.000045 | 0.000067 | ... | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 2.000000 | -79.097788 | -3.155313 | -82.423227 | -83.293531 |
25% | 1.067842e+09 | 26.974658 | -49.867977 | 23.353343 | 22.588895 | 22.090337 | 21.756799 | 21.572318 | 0.082809 | 0.054819 | ... | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 2.000000 | 0.405236 | 0.253759 | 0.061256 | -0.519485 |
50% | 1.254279e+09 | 50.853578 | -36.859736 | 24.087546 | 23.448707 | 22.982669 | 22.642771 | 22.539330 | 0.145490 | 0.104935 | ... | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 3.000000 | 0.723007 | 0.463569 | 0.265093 | 0.034923 |
75% | 1.430351e+09 | 86.261900 | -19.824120 | 24.728833 | 24.018363 | 23.547288 | 23.286306 | 23.560544 | 0.252106 | 0.165683 | ... | 2.000000 | 2.000000 | 2.000000 | 2.000000 | 2.000000 | 3.000000 | 1.100445 | 0.734454 | 0.455501 | 0.362411 |
max | 1.700553e+09 | 359.999923 | 5.370579 | 99.000000 | 99.000000 | 24.000000 | 99.000000 | 99.000000 | 775.299316 | 178.056946 | ... | 19.000000 | 19.000000 | 19.000000 | 19.000000 | 19.000000 | 3.000000 | 80.933868 | 82.423227 | 5.788702 | 79.849766 |
8 rows × 23 columns
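The maximum of 99 in the g, r, z and y magnitudes, together with colors reaching about ±80, suggests that 99 acts as a placeholder for missing measurements and inflates the means and standard deviations. That interpretation is an assumption inferred from the table, not something stated in this notebook. Under that assumption, the sketch below shows how masking the placeholder changes the statistics, using made-up values and the hypothetical name `SENTINEL`.

```python
import pandas as pd

SENTINEL = 99.0  # assumed "no measurement" marker, inferred from the max row

# Made-up magnitudes; two y-band entries carry the assumed placeholder value.
df = pd.DataFrame(
    {
        "mag_auto_i_dered": [22.9, 23.5, 21.8, 23.1],
        "mag_auto_y_dered": [22.5, 99.0, 21.6, 99.0],
    }
)

# Replace placeholder entries with NaN so that describe() ignores them.
masked = df.mask(df >= SENTINEL)

print(df.describe().loc[["count", "mean", "max"]])
print(masked.describe().loc[["count", "mean", "max"]])
```

In the masked table the y-band count drops to the number of valid entries and the mean returns to a physically plausible magnitude.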
Plots¶
Spatial distribution with geoviews¶
Defining some general settings for the plot.
Applying a coordinate transformation to R.A. and DEC.
Defining the longitude and latitude ticks.
Generating the labels plot, containing just the latitude and longitude ticks.
Building a geoviews Points element with the R.A. and DEC of the input sample. Here, we multiply the R.A. coordinates by -1 because, by convention, the x axis of sky maps is inverted (R.A. increases to the left). The tick labels in the plot still show the correct R.A. values; the sign flip is only a plotting device.
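The sign flip can be sketched independently of the plotting library. In the example below, the points are mirrored while the tick labels are computed from the true R.A., so the axis reads correctly. The tick spacing, the 0 to 360 degree labeling and all variable names are assumptions for illustration; the notebook's own tick definitions may differ.

```python
import numpy as np

# Made-up R.A. values in degrees, already wrapped to [-180, 180).
ra_wrapped = np.array([-60.0, 0.0, 45.0, 99.0])

# Mirror the x coordinates so that R.A. increases to the left on the map.
x_plot = -ra_wrapped

# Ticks are placed at mirrored positions but labeled with the true R.A.
tick_ra = list(range(-120, 121, 60))
tick_positions = [-t for t in tick_ra]
tick_labels = [f"{t % 360}°" for t in tick_ra]

for pos, label in zip(tick_positions, tick_labels):
    print(f"x = {pos:+5d}  ->  label {label}")
```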
Projecting the points in a Mollweide projection using geoviews and cartopy.
Applying datashader to the projected points.
Plotting the datashaded points, the DES footprint and the grid object, and measuring the elapsed time with the Jupyter magic command time. Note that the DES footprint R.A. coordinates were also multiplied by -1, for the reason explained above.
CPU times: user 6.65 s, sys: 113 ms, total: 6.76 s Wall time: 6.86 s
Magnitude distributions¶
Defining some general settings for the plot.
Defining some dictionaries to save the numpy histogram counts and bins, the holoviews histogram and the holoviews dimensions.
Making the plots for each band and measuring the elapsed time with the Jupyter magic command time.
CPU times: user 721 ms, sys: 7.59 ms, total: 729 ms Wall time: 729 ms
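The per-band bookkeeping described above can be sketched with NumPy alone: one dictionary holds the histogram counts and another the bin edges, both keyed by band. The random magnitudes, the bin range and the bin width below are assumptions for illustration, and the plotting step is omitted.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
bands = ["g", "r", "i", "z", "y"]

# Made-up magnitudes standing in for the mag_auto_{band}_dered columns.
magnitudes = {band: rng.normal(23.0, 1.0, size=1000) for band in bands}

# Assumed binning: 14 to 32 mag in steps of 0.25 mag.
bin_edges = np.arange(14.0, 32.0 + 0.25, 0.25)

hist_counts, hist_edges = {}, {}
for band in bands:
    hist_counts[band], hist_edges[band] = np.histogram(
        magnitudes[band], bins=bin_edges
    )

print({band: int(hist_counts[band].sum()) for band in bands})
```

Sharing the same bin edges across bands keeps the five histograms directly comparable.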
Error distributions¶
Defining some general settings for the plot.
Defining some dictionaries to save the numpy histogram counts and bins, the holoviews histogram and the holoviews dimensions.
Making the plots for each band and measuring the elapsed time with the Jupyter magic command time.
CPU times: user 718 ms, sys: 7.78 ms, total: 725 ms Wall time: 726 ms
Magnitude vs Error¶
Defining some general settings for the plot.
Defining some dictionaries to save the holoviews Points elements, the streams, the DynamicMap and the datashaded plots.
Making the plots for each band and measuring the elapsed time with the Jupyter magic command time.
CPU times: user 4.82 s, sys: 223 ms, total: 5.05 s Wall time: 5.12 s
Color-Magnitude Diagrams¶
Defining some general settings for the plot.
Defining some dictionaries to save the holoviews Points elements, the streams, the DynamicMap and the datashaded plots.
Making the plots for each band vs color and measuring the elapsed time with the Jupyter magic command time.
CPU times: user 4.68 s, sys: 185 ms, total: 4.87 s Wall time: 4.94 s
Color-Color Diagrams¶
Defining some general settings for the plot.
Defining some dictionaries to save the holoviews Points elements, the streams, the DynamicMap and the datashaded plots.
Making the plots for each color vs color and measuring the elapsed time with the Jupyter magic command time.
CPU times: user 2.88 s, sys: 135 ms, total: 3.02 s Wall time: 3.06 s