PZ Compute - Input Data QA Notebook - DES DR2 Sample of Galaxies¶
Collection of magnitude measurements for galaxies made available by the DES survey (release DES DR2), obtained from the DB LIneA database.
Contact: Luigi Silva (luigi.lcsilva@linea.org.br)
Acknowledgments¶
If you use this dataset to generate scientific results, please acknowledge LIneA in the acknowledgments section of your publication. For instance:
'This research used computational resources from the Associação Laboratório Interinstitucional de e-Astronomia (LIneA) with the financial support of INCT do e-Universo (Process no. 465376/2014-2).'
Notes about the data¶
This notebook contains a characterization of the magnitude measurements in the DES DR2 catalog, which has been publicly distributed and described in detail in the scientific literature by the DES project. See the details by clicking on the links in the table below.
seq. | Survey name (link to the website) | Number of objects in the original sample | Reference (link to the paper) |
---|---|---|---|
1 | DES DR2 | ~691 million distinct astronomical objects | DES Collaboration 2021 |
In this notebook, we will use the data from the coadd_objects table of the DES DR2 catalog. This table has, originally, 215 columns (the name and meaning of each column can be found here).
We will use the following columns here:
Column | Meaning |
---|---|
COADD_OBJECT_ID | Unique identifier for the coadded objects |
RA | Right ascension, with quantized precision for indexing (ALPHAWIN_J2000 has full precision but not indexed) [degrees] |
DEC | Declination, with quantized precision for indexing (DELTAWIN_J2000 has full precision but not indexed) [degrees] |
MAG_AUTO_{G,R,I,Z,Y}_DERED | Dereddened magnitude estimation (using SFD98), for an elliptical model based on the Kron radius [mag] |
MAGERR_AUTO_{G,R,I,Z,Y} | Uncertainty in magnitude estimation, for an elliptical model based on the Kron radius [mag] |
FLAGS_{G,R,I,Z,Y} | Additive flag describing cautionary advice about source extraction process. Use less than 4 for well behaved objects |
EXTENDED_CLASS_COADD | 0: high confidence stars; 1: candidate stars; 2: mostly galaxies; 3: high confidence galaxies; -9: No data; Using Sextractor photometry |
The SExtractor flags (FLAGS_{G,R,I,Z,Y}) are related to basic warnings about the source extraction process (see more here). If there is more than one warning, the flag values are added together. The flags are shown below, in order of increasing concern.
Value | Meaning |
---|---|
1 | aperture photometry is likely to be biased by neighboring sources or by more than 10% of bad pixels in any aperture |
2 | the object has been deblended |
4 | at least one object pixel is saturated |
8 | the isophotal footprint of the detected object is truncated (too close to an image boundary) |
16 | at least one photometric aperture is incomplete or corrupted (hitting buffer or memory limits) |
32 | the isophotal footprint is incomplete or corrupted (hitting buffer or memory limits) |
64 | a memory overflow occurred during deblending |
128 | a memory overflow occurred during extraction |
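Because the flags are additive powers of two, a combined value can be split back into its individual warnings with bitwise operations. The following is a minimal sketch; the helper names `decode_flags` and `is_well_behaved` are illustrative and not part of the catalog tooling.

```python
# Bit values of the SExtractor extraction flags listed in the table above.
FLAG_BITS = (1, 2, 4, 8, 16, 32, 64, 128)


def decode_flags(flags):
    """Return the individual flag bits that are set in an additive flag value."""
    return [bit for bit in FLAG_BITS if flags & bit]


def is_well_behaved(flags):
    """Apply the 'less than 4' recommendation (only bits 1 and 2 may be set)."""
    return flags < 4


print(decode_flags(19))     # [1, 2, 16]
print(is_well_behaved(3))   # True
print(is_well_behaved(19))  # False
```

For example, the value 19 combines the warnings 1, 2 and 16, so such an object would not pass the "less than 4" recommendation.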
Part 1 - Characterization of a random sample of data¶
We will apply the following filters in the subsequent selection of objects:
- EXTENDED_CLASS_COADD $\geq$ 2, that is, mostly galaxies and high confidence galaxies;
- MAG_AUTO_I_DERED $\leq$ 24.
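Furthermore, we will make a random selection of the data, using TABLESAMPLE SYSTEM in the query. Its argument is a percentage, so the value 0.03 used here selects around 0.03% of the rows of the table (not 3%). The sampling is applied to the table before the filters above, so the final number of objects is smaller than 0.03% of the table.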
Check below a brief characterization of the data.
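As a rough sanity check on the sampling fraction, the expected number of sampled rows can be estimated from the table size. This sketch assumes the ~691 million objects quoted above as the table size and treats the argument of TABLESAMPLE SYSTEM as a percentage; the variable names are illustrative.

```python
# Assumed inputs: approximate table size and the sampling percentage.
n_objects_total = 691_000_000  # ~691 million objects in DES DR2
sample_percent = 0.03          # argument passed to TABLESAMPLE SYSTEM, in percent

# Expected number of rows drawn before the galaxy and magnitude filters are applied.
expected_rows = n_objects_total * sample_percent / 100
print(f"Expected rows before filtering: ~{expected_rows:,.0f}")
```

The filtered sample retrieved later in this notebook (about 100 thousand objects) is smaller than this estimate, which is consistent with the filters being applied after the sampling.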
General settings for this notebook¶
Requirements for this notebook:
- General libraries: os, sys, numpy;
- Data access and manipulation libraries: pandas, dblinea;
- Astronomy: astropy;
- Visualization libraries: bokeh, holoviews, geoviews, cartopy, datashader;
- Auxiliary file: des-round19-poly.txt (contours of the area covered by the survey, i.e., DES footprint, 2019 version).
Download the file des-round19-poly.txt from the repository kadrlica/skymap on GitHub:
! wget https://raw.githubusercontent.com/kadrlica/skymap/master/skymap/data/des-round19-poly.txt -O des-round19-poly.txt
--2024-10-16 17:02:46-- https://raw.githubusercontent.com/kadrlica/skymap/master/skymap/data/des-round19-poly.txt Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.111.133, 185.199.110.133, 185.199.109.133, ... Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.111.133|:443... connected. HTTP request sent, awaiting response... 200 OK Length: 9947 (9.7K) [text/plain] Saving to: ‘des-round19-poly.txt’ des-round19-poly.tx 100%[===================>] 9.71K --.-KB/s in 0s 2024-10-16 17:02:46 (31.0 MB/s) - ‘des-round19-poly.txt’ saved [9947/9947]
Imports¶
Importing some scientific and visualization libraries.
### GENERAL
import os
import sys
import numpy as np
from IPython.display import IFrame
### DATA ACCESS AND MANIPULATION
import pandas as pd
from dblinea import DBBase
### ASTRONOMY
from astropy import units as u
from astropy.coordinates import SkyCoord
from astropy.units.quantity import Quantity
### VISUALIZATION
# Bokeh
import bokeh
from bokeh.io import output_notebook
# Holoviews
import holoviews as hv
from holoviews import streams, opts
from holoviews.operation import histogram
from holoviews.operation.datashader import rasterize, dynspread
# Geoviews
import geoviews as gv
import geoviews.feature as gf
# Cartopy
from cartopy import crs
Printing the versions of Python, Numpy, Bokeh and Holoviews:
print('Python version: ' + sys.version)
print('Numpy version: ' + np.__version__)
print('Bokeh version: ' + bokeh.__version__)
print('HoloViews version: ' + hv.__version__)
Python version: 3.12.3 | packaged by conda-forge | (main, Apr 15 2024, 18:38:13) [GCC 12.3.0] Numpy version: 1.26.4 Bokeh version: 3.4.2 HoloViews version: 1.19.1
Configs¶
Setting the number of rows that pandas will show:
pd.set_option('display.max_rows', 10)
Setting holoviews to work with bokeh:
hv.extension('bokeh')
Setting geoviews to work with bokeh:
gv.extension('bokeh')
Set bokeh plots to be inline:
output_notebook()
Set matplotlib plots to be inline:
%matplotlib inline
Reading and filtering the data¶
Read DES footprint file des-round19-poly.txt¶
Here, we will read the DES DR2 footprint from the des-round19-poly.txt file and print the minimum and maximum of R.A. and DEC.
foot_ra, foot_dec = np.loadtxt('des-round19-poly.txt', unpack=True)
print("R.A. AND DEC COORDINATES, BEFORE USING SKYCOORD")
print(f"R.A. min: {foot_ra.min():.2f} | R.A. max: {foot_ra.max():.2f}")
print(f"DEC min: {foot_dec.min():.2f} | DEC max: {foot_dec.max():.2f}")
R.A. AND DEC COORDINATES, BEFORE USING SKYCOORD R.A. min: -60.00 | R.A. max: 99.00 DEC min: -66.90 | DEC max: 5.00
After reading the footprint, we create a SkyCoord object from the Astropy library using the footprint R.A. and DEC coordinates. SkyCoord provides a flexible interface for representing, manipulating and transforming celestial coordinates between systems. We also use the units module of Astropy; with u.degree, for example, we indicate that the coordinates are in degrees. Additionally, we use the wrap_at method to ensure that the R.A. coordinates are in the range $[-180,180)$.
foot_coords = SkyCoord(ra=foot_ra*u.degree, dec=foot_dec*u.degree, frame='icrs')
foot_df = pd.DataFrame({'foot_ra': np.array(foot_coords.ra.wrap_at(180*u.degree)),
'foot_dec': np.array(foot_coords.dec)})
print("R.A. AND DEC COORDINATES, AFTER USING SKYCOORD")
print(f"R.A. min: {foot_df['foot_ra'].min():.2f} | R.A. max: {foot_df['foot_ra'].max():.2f}")
print(f"DEC min: {foot_df['foot_dec'].min():.2f} | DEC max: {foot_df['foot_dec'].max():.2f}")
R.A. AND DEC COORDINATES, AFTER USING SKYCOORD R.A. min: -60.00 | R.A. max: 99.00 DEC min: -66.90 | DEC max: 5.00
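The wrapping performed by wrap_at(180*u.degree) is equivalent to simple modular arithmetic. The sketch below reproduces it without Astropy for plain numbers; `wrap_ra` is a hypothetical helper, shown only to make the mapping explicit.

```python
def wrap_ra(ra_deg):
    """Map a right ascension in degrees to the interval [-180, 180)."""
    return (ra_deg + 180.0) % 360.0 - 180.0


for ra in (0.0, 99.0, 180.0, 300.0, 359.5):
    print(f"{ra:6.1f} -> {wrap_ra(ra):7.1f}")
```

The footprint file already stores R.A. between -60 and 99 degrees, which is why the minimum and maximum printed above are unchanged by the wrapping.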
Getting the DES DR2 data via dblinea¶
Instantiating the DBBase class, which makes the connection to the database.
db = DBBase()
Defining what data we want to access. Here, we will access the data from the coadd_objects table of the DES DR2 catalog, which is addressed in the database through the schema and table name below (des_dr2.main).
schema = "des_dr2"
tablename = "main"
Defining the parameters for the random function in the query (TABLESAMPLE SYSTEM).
tablesample_percentual = 0.03 # Approximate percentage of the table rows that TABLESAMPLE SYSTEM will select randomly (0.03 means 0.03%).
rand_seed = 100 # This seed is defined here to reproduce the same result every time the notebook is run.
Defining the filters as described in the introduction of this notebook (EXTENDED_CLASS_COADD $\geq$ 2 and MAG_AUTO_I_DERED $\leq$ 24).
extended_class_coadd_lim = 2
mag_lim_i = 24
Defining the query.
query = (f"SELECT coadd_object_id, ra, dec, mag_auto_g_dered, mag_auto_r_dered, mag_auto_i_dered, mag_auto_z_dered, mag_auto_y_dered, magerr_auto_g, "+
f"magerr_auto_r, magerr_auto_i, magerr_auto_z, magerr_auto_y, flags_g, flags_r, flags_i, flags_z, flags_y, extended_class_coadd "+
f"FROM {schema}.{tablename} "+
f"TABLESAMPLE SYSTEM({tablesample_percentual:.2f}) REPEATABLE ({rand_seed}) "+
f"WHERE (extended_class_coadd >= {extended_class_coadd_lim} "+
f"AND mag_auto_i_dered <= {mag_lim_i}) "
)
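Since the column list follows a regular pattern across the five bands, the same SELECT statement can also be assembled programmatically. This is only a sketch of an alternative; `build_query` is a hypothetical helper, and the resulting string would be passed to db.fetchall_df in the same way as the query defined above.

```python
BANDS = ("g", "r", "i", "z", "y")


def build_query(schema, tablename, percent, seed, class_min, mag_i_max):
    """Assemble the sampling query from plain, trusted values (not user input)."""
    columns = (
        ["coadd_object_id", "ra", "dec"]
        + [f"mag_auto_{b}_dered" for b in BANDS]
        + [f"magerr_auto_{b}" for b in BANDS]
        + [f"flags_{b}" for b in BANDS]
        + ["extended_class_coadd"]
    )
    return (
        f"SELECT {', '.join(columns)} "
        f"FROM {schema}.{tablename} "
        f"TABLESAMPLE SYSTEM({percent:.2f}) REPEATABLE ({seed}) "
        f"WHERE (extended_class_coadd >= {class_min} "
        f"AND mag_auto_i_dered <= {mag_i_max})"
    )


query_sketch = build_query("des_dr2", "main", 0.03, 100, 2, 24)
print(query_sketch)
```

Generating the column names from the band list avoids typing each of the fifteen per-band columns by hand.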
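Running the query and measuring the elapsed time with the Jupyter magic command %%time. The run time depends on the database load: it may take up to about 3 minutes, although the run recorded below took only a few seconds.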
%%time
df_input = db.fetchall_df(query)
CPU times: user 1.61 s, sys: 373 ms, total: 1.98 s Wall time: 3.4 s
Computing and saving the colors (g-r), (r-i), (i-z) and (z-y).
df_input['mag_auto_(g-r)_dered'] = df_input['mag_auto_g_dered'] - df_input['mag_auto_r_dered']
df_input['mag_auto_(r-i)_dered'] = df_input['mag_auto_r_dered'] - df_input['mag_auto_i_dered']
df_input['mag_auto_(i-z)_dered'] = df_input['mag_auto_i_dered'] - df_input['mag_auto_z_dered']
df_input['mag_auto_(z-y)_dered'] = df_input['mag_auto_z_dered'] - df_input['mag_auto_y_dered']
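The four colors are differences between adjacent bands, so the column names can be derived by pairing each band with the next one. A minimal sketch of that pairing follows; the function `color_columns` is illustrative, and applying it to df_input would repeat the subtractions above.

```python
def color_columns(bands=("g", "r", "i", "z", "y")):
    """Return (color column, bluer magnitude column, redder magnitude column) triples."""
    triples = []
    for blue, red in zip(bands, bands[1:]):
        triples.append((
            f"mag_auto_({blue}-{red})_dered",
            f"mag_auto_{blue}_dered",
            f"mag_auto_{red}_dered",
        ))
    return triples


for color_col, blue_col, red_col in color_columns():
    # With the DataFrame loaded: df_input[color_col] = df_input[blue_col] - df_input[red_col]
    print(f"{color_col} = {blue_col} - {red_col}")
```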
Basic statistics¶
Below, we have basic data statistics for each column of the table.
basic_stats = df_input.describe()
basic_stats
coadd_object_id | ra | dec | mag_auto_g_dered | mag_auto_r_dered | mag_auto_i_dered | mag_auto_z_dered | mag_auto_y_dered | magerr_auto_g | magerr_auto_r | ... | flags_g | flags_r | flags_i | flags_z | flags_y | extended_class_coadd | mag_auto_(g-r)_dered | mag_auto_(r-i)_dered | mag_auto_(i-z)_dered | mag_auto_(z-y)_dered | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
count | 9.998800e+04 | 99988.000000 | 99988.000000 | 99988.000000 | 99988.000000 | 99988.000000 | 99988.000000 | 99988.000000 | 99988.000000 | 99988.000000 | ... | 99988.000000 | 99988.000000 | 99988.000000 | 99988.000000 | 99988.000000 | 99988.000000 | 99988.000000 | 99988.000000 | 99988.000000 | 99988.000000 |
mean | 1.251899e+09 | 102.527895 | -33.877490 | 25.287078 | 23.256352 | 22.627857 | 22.488806 | 30.107333 | 2.132968 | 0.280633 | ... | 0.733088 | 0.731068 | 0.729248 | 0.730488 | 0.751410 | 2.747720 | 2.030726 | 0.628496 | 0.139051 | -7.618526 |
std | 2.151841e+08 | 119.488098 | 19.588251 | 10.150428 | 3.161639 | 1.276152 | 3.110938 | 23.318106 | 13.772016 | 3.822448 | ... | 1.226948 | 1.211881 | 1.203133 | 1.209414 | 1.326831 | 0.434323 | 9.873097 | 2.870491 | 2.820223 | 22.981943 |
min | 8.702647e+08 | 0.000243 | -67.432467 | 10.817312 | 10.896320 | 10.564870 | 9.920390 | 8.397012 | 0.000045 | 0.000067 | ... | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 2.000000 | -79.097788 | -3.155313 | -82.423227 | -83.293531 |
25% | 1.067842e+09 | 26.974658 | -49.867977 | 23.353343 | 22.588895 | 22.090337 | 21.756799 | 21.572318 | 0.082809 | 0.054819 | ... | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 2.000000 | 0.405236 | 0.253759 | 0.061256 | -0.519485 |
50% | 1.254279e+09 | 50.853578 | -36.859736 | 24.087546 | 23.448707 | 22.982669 | 22.642771 | 22.539330 | 0.145490 | 0.104935 | ... | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 3.000000 | 0.723007 | 0.463569 | 0.265093 | 0.034923 |
75% | 1.430351e+09 | 86.261900 | -19.824120 | 24.728833 | 24.018363 | 23.547288 | 23.286306 | 23.560544 | 0.252106 | 0.165683 | ... | 2.000000 | 2.000000 | 2.000000 | 2.000000 | 2.000000 | 3.000000 | 1.100445 | 0.734454 | 0.455501 | 0.362411 |
max | 1.700553e+09 | 359.999923 | 5.370579 | 99.000000 | 99.000000 | 24.000000 | 99.000000 | 99.000000 | 775.299316 | 178.056946 | ... | 19.000000 | 19.000000 | 19.000000 | 19.000000 | 19.000000 | 3.000000 | 80.933868 | 82.423227 | 5.788702 | 79.849766 |
8 rows × 23 columns
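The statistics above hint at placeholder values: the maxima of several magnitude columns are exactly 99, and the mean of mag_auto_y_dered (about 30.1) lies far from its median (about 22.5). The sketch below uses made-up numbers to show how a few such placeholders pull the mean while leaving the median nearly unchanged. Treating 99 as a "no measurement" sentinel is an assumption based on these maxima.

```python
from statistics import mean, median

SENTINEL = 99.0  # assumed placeholder for a missing magnitude

# Made-up magnitudes: eight ordinary values plus two placeholders.
mags = [21.5, 22.0, 22.3, 22.5, 22.6, 22.9, 23.1, 23.4, SENTINEL, SENTINEL]
valid = [m for m in mags if m < SENTINEL]

print(f"mean   with sentinels: {mean(mags):.2f} | without: {mean(valid):.2f}")
print(f"median with sentinels: {median(mags):.2f} | without: {median(valid):.2f}")
```

This is why the plots in the following sections set their axis ranges from the quartiles rather than from the mean and standard deviation.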
Plots¶
Spatial distribution with geoviews¶
Defining some general settings for the plot.
title = 'Galaxy spatial distribution - DES DR2'
height = 500
width = 1000
padding = 0.05
xlabel = 'R.A.'
ylabel = 'Dec.'
Applying a coordinate transformation to R.A. and DEC. (wrapping R.A. to the range $[-180,180)$).
coords = SkyCoord(ra=np.array(df_input.ra)*u.degree,
dec=np.array(df_input.dec)*u.degree, frame='icrs')
df_input.ra = np.array(coords.ra.wrap_at(180*u.degree))
df_input.dec = np.array(coords.dec)
Defining the longitude and latitude ticks.
longitudes = np.arange(30, 360, 30)
latitudes = np.arange(-75, 76, 15)
Generating the labels plot, containing just the latitude and longitude ticks.
lon_labels = [f"{lon}°" for lon in longitudes]
lat_labels = [f"{lat}°" for lat in latitudes]
labels_data = {
"lon": list(np.flip(longitudes)) + [-180] * len(latitudes),
"lat": [0] * len(longitudes) + list(latitudes),
"label": lon_labels + lat_labels,
}
df_labels = pd.DataFrame(labels_data)
labels_plot = gv.Labels(df_labels, kdims=["lon", "lat"], vdims=["label"]).opts(
text_font_size="12pt",
text_color="black",
text_align='right',
text_baseline='bottom',
projection=crs.Mollweide()
)
Building a geoviews Points element with R.A. and DEC. from the input sample. Here, we multiply the R.A. coordinates by (-1) because sky maps are conventionally drawn with the x axis inverted (R.A. increasing to the left). The tick labels are generated separately and still show the correct values, so this is only a plotting device.
ra_dec_points = gv.Points((-df_input['ra'],df_input['dec']), kdims=['ra', 'dec'])
Projecting the points in a Mollweide projection using geoviews and cartopy.
projected = gv.operation.project(ra_dec_points, projection=crs.Mollweide())
Applying datashader to the projected points.
dsh_points = dynspread(rasterize(projected).opts(cmap="Viridis", cnorm='log'))
dsh_points = dsh_points.opts(width=width, height=height, padding=padding, title=title, toolbar='above', colorbar=True, tools=['box_select'])
Plotting the datashaded points, the DES footprint and the grid object, and computing the elapsed time with the jupyter magic command time. As you can see, the DES footprint R.A. coordinates were also multiplied by (-1), for the same reason explained before.
%%time
ra_dec_foot = gv.Path((-foot_ra, foot_dec)).opts(line_width=3, color='orange')
grid = gf.grid()
final_plot = dsh_points * ra_dec_foot * grid * labels_plot
hv.save(final_plot, 'spatial_distribution.html')
CPU times: user 6.65 s, sys: 113 ms, total: 6.76 s Wall time: 6.86 s
IFrame('spatial_distribution.html', width=1200, height=600)
Magnitude distributions¶
Defining some general settings for the plot.
height = 400
width = 400
xlabel = 'mag'
n_spread = 6 ### Number to restrict the plot interval for each quantity. The larger this number, the larger the range in x.
x_threshold = 99 ### Maximum number in the x axis to plot.
num_mag_bins = 45 ### Number of mag bins in the x interval.
bands = ['g', 'r', 'i', 'z', 'Y']
Defining some dictionaries to save the numpy histogram counts and bins, the holoviews histograms and the holoviews dimensions.
mag_count = {}
mag_bin = {}
mag_distribution_histo = {}
mag_dim = {}
mag_freqs_dim = {}
Doing the plots for each band and computing the elapsed time with the jupyter magic command time.
%%time
for band in bands:
    ### DEFINING THE TITLES AND THE LABELS OF EACH PLOT
    title = 'Distribution of Magnitudes - Band '+band
    label_name_upper = 'mag '+band
    ### WE WILL USE ALL THE BAND NAMES IN LOWER CASE FOR THE DICTIONARY KEYS.
    band = band.lower()
    ### DEFINING THE NAME OF THE MAGNITUDE COLUMN AS IT IS IN THE DATA TABLE, AND ALSO DEFINING THE LABEL IN LOWER CASE.
    catalog_mag_name = 'mag_auto_'+band+'_dered'
    label_name = 'mag '+band
    ### DEFINING THE DIMENSIONS IN HOLOVIEWS.
    ### This is important because when we define different dimensions, the plots become independent (when zooming one axis in one plot, for example,
    ### no other plot will be zoomed).
    mag_dim[band] = hv.Dimension(catalog_mag_name, label=label_name)
    mag_freqs_dim[band] = hv.Dimension((catalog_mag_name+'_'+'freqs'), label=(label_name+' freqs'))
    ### DEFINING THE X LIMITS (MAG LIMITS) FOR DISPLAY IN THE PLOTS AND FOR THE LATER BINS DEFINITION.
    xlim_min = basic_stats[catalog_mag_name]['50%'] - n_spread*(np.abs(basic_stats[catalog_mag_name]['50%']-basic_stats[catalog_mag_name]['25%']))
    xlim_max = basic_stats[catalog_mag_name]['50%'] + n_spread*(np.abs(basic_stats[catalog_mag_name]['50%']-basic_stats[catalog_mag_name]['75%']))
    ### COMPUTING THE BINS.
    ### Note that, although the bins cover the entire range (limited by the x_threshold), the bin width is optimized for the above x limits.
    step_mag_bins = (xlim_max-xlim_min)/num_mag_bins
    mag_bins = np.arange(df_input[catalog_mag_name].min(),x_threshold+step_mag_bins,step_mag_bins)
    ### DEFINING THE NUMPY HISTOGRAM.
    (mag_count[band], mag_bin[band]) = np.histogram(df_input[catalog_mag_name], bins=mag_bins)
    ### DEFINING THE Y LIMITS (FREQUENCY LIMITS) FOR DISPLAY IN THE PLOTS.
    ### Here, the last element is excluded because it holds the magnitudes equal to 99, which could distort the plot.
    ylim_min = 0
    ylim_max = mag_count[band][:-1].max()
    ### DOING THE HISTOGRAM WITH HOLOVIEWS.
    mag_distribution_histo[band] = hv.Histogram((mag_count[band], mag_bin[band]), kdims=mag_dim[band], vdims=mag_freqs_dim[band]).opts(
        title=title, xlabel=label_name_upper, ylabel='frequencies', height=height, width=width, xlim=(xlim_min, xlim_max), ylim=(ylim_min,ylim_max))
mag_distribution = (mag_distribution_histo['g'] + mag_distribution_histo['r'] + mag_distribution_histo['i']
                    + mag_distribution_histo['z'] + mag_distribution_histo['y']).cols(2)
hv.save(mag_distribution, 'mag_distribution.html')
CPU times: user 721 ms, sys: 7.59 ms, total: 729 ms Wall time: 729 ms
IFrame('mag_distribution.html', width=1200, height=1300)
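The x-axis limits in the loop above are built from the quartiles in basic_stats: the median minus or plus n_spread times the distance to the first or third quartile. A small sketch of that rule, using the g-band quartiles from the statistics table as input (`quartile_limits` is a hypothetical helper):

```python
def quartile_limits(q25, q50, q75, n_spread=6):
    """Return (lower, upper) limits: median -/+ n_spread times the quartile distances."""
    lower = q50 - n_spread * abs(q50 - q25)
    upper = q50 + n_spread * abs(q75 - q50)
    return lower, upper


# Quartiles of mag_auto_g_dered taken from the basic statistics table.
xlim_min, xlim_max = quartile_limits(23.353343, 24.087546, 24.728833)
print(f"g band display range: {xlim_min:.2f} to {xlim_max:.2f}")
```

Because the limits rely on quartiles rather than on the minimum and maximum, outliers such as the values at 99 do not stretch the displayed range.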
Error distributions¶
Defining some general settings for the plot.
height = 400
width = 400
xlabel = 'mag_error'
n_spread = 6 ### Number to restrict the plot interval for each quantity. The larger this number, the larger the range in x.
x_threshold = 5 ### Maximum number in the x axis to plot.
num_err_bins = 40 ### Number of error bins in the x interval.
bands = ['g', 'r', 'i', 'z', 'Y']
Defining some dictionaries to save the numpy histogram counts and bins, the holoviews histograms and the holoviews dimensions.
err_count = {}
err_bin = {}
err_distribution_histo = {}
err_dim = {}
err_freqs_dim = {}
Doing the plots for each band and computing the elapsed time with the jupyter magic command time.
%%time
for band in bands:
    ### DEFINING THE TITLES AND THE LABELS OF EACH PLOT
    title = 'Distribution of Magnitude Errors - Band '+band
    label_name_upper = 'mag error '+band
    ### WE WILL USE ALL THE BAND NAMES IN LOWER CASE FOR THE DICTIONARY KEYS.
    band = band.lower()
    ### DEFINING THE NAME OF THE ERROR COLUMN AS IT IS IN THE DATA TABLE, AND ALSO DEFINING THE LABEL IN LOWER CASE.
    catalog_mag_name_err = 'magerr_auto_'+band
    label_name_err = 'magerr '+band
    ### DEFINING THE DIMENSIONS IN HOLOVIEWS.
    ### This is important because when we define different dimensions, the plots become independent (when zooming one axis in one plot, for example,
    ### no other plot will be zoomed).
    err_dim[band] = hv.Dimension(catalog_mag_name_err, label=label_name_err)
    err_freqs_dim[band] = hv.Dimension((catalog_mag_name_err+'_'+'freqs'), label=(label_name_err+' freqs'))
    ### DEFINING THE X LIMITS (ERROR LIMITS) FOR DISPLAY IN THE PLOTS AND FOR THE LATER BINS DEFINITION.
    xlim_min = 0.
    xlim_max = basic_stats[catalog_mag_name_err]['50%'] + n_spread*(np.abs(basic_stats[catalog_mag_name_err]['50%']-basic_stats[catalog_mag_name_err]['75%']))
    ### COMPUTING THE BINS.
    ### Note that, although the bins cover the entire range (limited by the x_threshold), the bin width is optimized for the above x limits.
    step_err_bins = (xlim_max-xlim_min)/num_err_bins
    err_bins = np.arange(xlim_min,x_threshold+step_err_bins,step_err_bins)
    ### DEFINING THE NUMPY HISTOGRAM.
    (err_count[band], err_bin[band]) = np.histogram(df_input[catalog_mag_name_err], bins=err_bins)
    ### DEFINING THE Y LIMITS (FREQUENCY LIMITS) FOR DISPLAY IN THE PLOTS.
    ylim_min = 0
    ylim_max = err_count[band].max()
    ### DOING THE HISTOGRAM WITH HOLOVIEWS.
    err_distribution_histo[band] = hv.Histogram((err_count[band], err_bin[band]), kdims=err_dim[band], vdims=err_freqs_dim[band]).opts(
        title=title, xlabel=label_name_upper, ylabel='frequencies', height=height, width=width, xlim=(xlim_min, xlim_max), ylim=(ylim_min, ylim_max))
err_distribution = (err_distribution_histo['g'] + err_distribution_histo['r'] + err_distribution_histo['i'] + err_distribution_histo['z']
                    + err_distribution_histo['y']).cols(2)
hv.save(err_distribution, 'error_distribution.html')
CPU times: user 718 ms, sys: 7.78 ms, total: 725 ms Wall time: 726 ms
IFrame('error_distribution.html', width=1200, height=1300)
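In the error histograms the bin width is derived from the displayed range (0 to xlim_max), while the bin edges continue up to x_threshold so that larger errors are still counted. A sketch of that construction in plain Python, with the g-band quartiles as example input (`make_bin_edges` is a hypothetical stand-in for the np.arange call, not an exact replica of its floating-point behavior):

```python
import math


def make_bin_edges(start, threshold, step):
    """Evenly spaced edges from start until the last edge reaches or passes threshold."""
    n_steps = math.ceil((threshold - start) / step)
    return [start + k * step for k in range(n_steps + 1)]


# Display range for magerr_auto_g: 0 to median + 6 * (Q3 - median).
xlim_max = 0.145490 + 6 * (0.252106 - 0.145490)
step = xlim_max / 40  # 40 bins inside the displayed range
edges = make_bin_edges(0.0, 5.0, step)

print(f"bin width: {step:.4f} | number of edges: {len(edges)}")
```

Only the first 40 bins fall inside the plotted interval; the remaining ones exist so that every error up to the threshold is assigned to a bin of the same width.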
Magnitude vs Error¶
Defining some general settings for the plot.
height = 400
width = 500
padding = 0.05
mag_n_spread = 6 ### Number to restrict the plot interval for each quantity. The larger this number, the larger the range in the plot.
err_n_spread = 6 ### Number to restrict the plot interval for each quantity. The larger this number, the larger the range in the plot.
bands = ['g', 'r', 'i', 'z', 'Y']
Defining some dictionaries to save the holoviews Points elements, the streams, the DynamicMap and the datashaded plots.
mag_vs_err = {} # Dict for saving hv.Points with mags and mag errors.
box_mag_vs_err = {} # Dict for saving streams.BoundsXY.
bounds_mag_vs_err = {} # Dict for saving hv_DynamicMap.
p_mag_vs_err = {} # Dict for saving the datashaded plots.
Doing the plots for each band and computing the elapsed time with the jupyter magic command time.
%%time
for band in bands:
    ### DEFINING THE TITLES AND THE LABELS OF EACH PLOT
    title = 'Magnitude x Error - Band '+band
    mag_label_name_upper = 'mag '+band
    err_label_name_upper = 'mag error '+band
    ### WE WILL USE ALL THE BAND NAMES IN LOWER CASE FOR THE DICTIONARY KEYS.
    band_low = band.lower()
    ### FOR MAGNITUDES
    ### Defining the name of the magnitude column as it is in the data table, and also defining the lower case magnitude label (this will also be our holoviews dimension).
    mag_catalog_name = 'mag_auto_'+band_low+'_dered'
    mag_label_name = 'mag '+band_low
    ### Defining the magnitude limits for display in the plots.
    mag_xlim_min = basic_stats[mag_catalog_name]['50%'] - mag_n_spread*(np.abs(basic_stats[mag_catalog_name]['50%']-basic_stats[mag_catalog_name]['25%']))
    mag_xlim_max = basic_stats[mag_catalog_name]['50%'] + mag_n_spread*(np.abs(basic_stats[mag_catalog_name]['50%']-basic_stats[mag_catalog_name]['75%']))
    ### FOR MAGNITUDE ERRORS
    ### Defining the name of the error column as it is in the data table, and also defining the lower case error label (this will also be our holoviews dimension).
    err_catalog_name = 'magerr_auto_'+band_low
    err_label_name = 'mag '+band_low+' error'
    ### Defining the error limits for display in the plots.
    err_xlim_min = 0.
    err_xlim_max = basic_stats[err_catalog_name]['50%'] + err_n_spread*(np.abs(basic_stats[err_catalog_name]['50%']-basic_stats[err_catalog_name]['75%']))
    ### DEFINING THE POINTS ELEMENT IN HOLOVIEWS, CONTAINING THE MAGNITUDES AND ERRORS.
    mag_vs_err[band_low] = hv.Points((df_input[mag_catalog_name],df_input[err_catalog_name]), kdims=[mag_label_name,err_label_name])
    ### CREATING THE LINKED STREAMS INSTANCE
    boundsxy_mag_vs_err = (0, 0, 0, 0)
    box_mag_vs_err[band_low] = streams.BoundsXY(source=mag_vs_err[band_low], bounds=boundsxy_mag_vs_err)
    bounds_mag_vs_err[band_low] = hv.DynamicMap(lambda bounds: hv.Bounds(bounds), streams=[box_mag_vs_err[band_low]])
    ### USING DATASHADER
    # 1) Types of cnorm -> log, linear, eq_hist. Eq_hist is better for the colors when we don't know the incoming distribution,
    # but is bad for the numbers in the colorbar.
    # 2) Use clim=(0, 10000) in the opts to fix the colorbar range.
    p_mag_vs_err[band_low] = dynspread(rasterize(mag_vs_err[band_low]).opts(cmap="Viridis", cnorm='log'))
    p_mag_vs_err[band_low] = p_mag_vs_err[band_low].opts(width=width, height=height, padding=padding, show_grid=True, xlim=(mag_xlim_min, mag_xlim_max),
        ylim=(err_xlim_min, err_xlim_max), xlabel=mag_label_name_upper, ylabel=err_label_name_upper, tools=['box_select'],
        xticks=5, yticks=5, title=title, toolbar='above', colorbar=True)
mag_vs_err_plot = ((p_mag_vs_err['g'] * bounds_mag_vs_err['g']) + (p_mag_vs_err['r'] * bounds_mag_vs_err['r']) + (p_mag_vs_err['i'] * bounds_mag_vs_err['i'])
                   + (p_mag_vs_err['z'] * bounds_mag_vs_err['z']) + (p_mag_vs_err['y'] * bounds_mag_vs_err['y'])).cols(2)
hv.save(mag_vs_err_plot, 'mag_vs_err_plot.html')
CPU times: user 4.82 s, sys: 223 ms, total: 5.05 s Wall time: 5.12 s
IFrame('mag_vs_err_plot.html', width=1200, height=1300)
Color-Magnitude Diagrams¶
Defining some general settings for the plot.
height = 400
width = 500
padding = 0.05
mag_n_spread = 6 ### Number to restrict the plot interval for each quantity. The larger this number, the larger the range in the plot.
color_n_spread = 6 ### Number to restrict the plot interval for each quantity. The larger this number, the larger the range in the plot.
plots_names_color_mag = ['g','(g-r)','r','(r-i)','i','(i-z)','z','(z-Y)', 'i', '(g-r)']
Defining some dictionaries to save the holoviews Points elements, the streams, the DynamicMap and the datashaded plots.
mag_vs_color = {}
box_mag_vs_color = {}
bounds_mag_vs_color = {}
p_mag_vs_color = {}
Doing the plots for each band vs color and computing the elapsed time with the jupyter magic command time.
%%time
j=1
for x in range((len(plots_names_color_mag)-1)):
    if x%2 == 0:
        ### DEFINING THE TITLES AND THE LABELS OF EACH PLOT
        mag_name = plots_names_color_mag[x]
        mag_name_low = mag_name.lower()
        color_name = plots_names_color_mag[x+1]
        color_name_low = color_name.lower()
        title = 'Color-Magnitude Diagram - '+mag_name+' x '+color_name
        mag_label_name_upper = 'mag '+mag_name
        color_label_name_upper = 'color '+color_name
        ### WE WILL USE ALL THE BAND AND COLOR NAMES IN LOWER CASE FOR THE DICTIONARY KEYS.
        plot_name = mag_name_low+'_'+color_name_low
        ### FOR MAGNITUDES
        ### Defining the name of the magnitude column as it is in the data table, and also defining the magnitude label (this will also be our holoviews dimension).
        mag_catalog_name = 'mag_auto_'+mag_name_low+'_dered'
        mag_label_name = 'mag '+mag_name+' plot '+str(j) # This str(j) here is important for independence between plots.
        ### Defining the magnitude limits for display in the plots.
        mag_xlim_min = basic_stats[mag_catalog_name]['50%'] - mag_n_spread*(np.abs(basic_stats[mag_catalog_name]['50%']-basic_stats[mag_catalog_name]['25%']))
        mag_xlim_max = basic_stats[mag_catalog_name]['50%'] + mag_n_spread*(np.abs(basic_stats[mag_catalog_name]['50%']-basic_stats[mag_catalog_name]['75%']))
        ### FOR COLORS
        ### Defining the name of the color column as it is in the data table, and also defining the color label (this will also be our holoviews dimension).
        color_catalog_name = 'mag_auto_'+color_name_low+'_dered'
        color_label_name = 'color '+color_name+' plot '+str(j) # This str(j) here is important for independence between plots.
        ### Defining the color limits for display in the plots.
        color_xlim_min = basic_stats[color_catalog_name]['50%'] - color_n_spread*(np.abs(basic_stats[color_catalog_name]['50%']-basic_stats[color_catalog_name]['25%']))
        color_xlim_max = basic_stats[color_catalog_name]['50%'] + color_n_spread*(np.abs(basic_stats[color_catalog_name]['50%']-basic_stats[color_catalog_name]['75%']))
        ### DEFINING THE POINTS ELEMENT IN HOLOVIEWS, CONTAINING THE MAGNITUDES AND COLORS.
        mag_vs_color[plot_name] = hv.Points((df_input[mag_catalog_name],df_input[color_catalog_name]), kdims=[mag_label_name,color_label_name])
        ### CREATING THE LINKED STREAMS INSTANCE
        boundsxy_mag_vs_color = (0, 0, 0, 0)
        box_mag_vs_color[plot_name] = streams.BoundsXY(source=mag_vs_color[plot_name], bounds=boundsxy_mag_vs_color)
        bounds_mag_vs_color[plot_name] = hv.DynamicMap(lambda bounds: hv.Bounds(bounds), streams=[box_mag_vs_color[plot_name]])
        ### USING DATASHADER
        # 1) Types of cnorm -> log, linear, eq_hist. Eq_hist is better for the colors when we don't know the incoming distribution,
        # but is bad for the numbers in the colorbar.
        # 2) Use clim=(0, 10000) in the opts to fix the colorbar range.
        p_mag_vs_color[plot_name] = dynspread(rasterize(mag_vs_color[plot_name]).opts(cmap="Viridis", cnorm='log'))
        p_mag_vs_color[plot_name] = p_mag_vs_color[plot_name].opts(width=width, height=height, padding=padding, show_grid=True, xlim=(mag_xlim_min, mag_xlim_max),
            ylim=(color_xlim_min, color_xlim_max), xlabel=mag_label_name_upper, ylabel=color_label_name_upper, tools=['box_select'],
            xticks=5, yticks=5, title=title, toolbar='above', colorbar=True)
        j+=1
mag_vs_color_plot = ((p_mag_vs_color['g_(g-r)'] * bounds_mag_vs_color['g_(g-r)']) + (p_mag_vs_color['r_(r-i)'] * bounds_mag_vs_color['r_(r-i)']) +
                     (p_mag_vs_color['i_(i-z)'] * bounds_mag_vs_color['i_(i-z)']) + (p_mag_vs_color['z_(z-y)'] * bounds_mag_vs_color['z_(z-y)']) +
                     (p_mag_vs_color['i_(g-r)'] * bounds_mag_vs_color['i_(g-r)'])).cols(2)
hv.save(mag_vs_color_plot, 'mag_vs_color_plot.html')
CPU times: user 4.68 s, sys: 185 ms, total: 4.87 s Wall time: 4.94 s
IFrame('mag_vs_color_plot.html', width=1200, height=1300)
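The list plots_names_color_mag alternates magnitude and color names, and the loop above visits every even index to read one pair at a time. An equivalent way to read the pairs, shown here only as an illustrative alternative, slices the list into its even and odd entries:

```python
plots_names_color_mag = ['g', '(g-r)', 'r', '(r-i)', 'i', '(i-z)', 'z', '(z-Y)', 'i', '(g-r)']

# Even positions hold magnitudes, odd positions hold the color plotted against them.
pairs = list(zip(plots_names_color_mag[0::2], plots_names_color_mag[1::2]))

for j, (mag_name, color_name) in enumerate(pairs, start=1):
    # Same dictionary key convention as in the loop above: lower case, joined by '_'.
    plot_name = mag_name.lower() + '_' + color_name.lower()
    print(f"plot {j}: key '{plot_name}'")
```

The resulting keys are the ones used to assemble the final layout, for example 'z_(z-y)' for the z magnitude against the (z-Y) color.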
Color-Color Diagrams¶
Defining some general settings for the plot.
height = 400
width = 500
padding = 0.05
color_1_n_spread = 6 ### Number to restrict the plot interval for each quantity. The larger this number, the larger the range in the plot.
color_2_n_spread = 6 ### Number to restrict the plot interval for each quantity. The larger this number, the larger the range in the plot.
plots_names_color_color = ['(g-r)','(r-i)','(r-i)','(i-z)','(i-z)','(z-Y)']
Defining some dictionaries to save the holoviews Points elements, the streams, the DynamicMap and the datashaded plots.
color_vs_color = {}
box_color_vs_color = {}
bounds_color_vs_color = {}
p_color_vs_color = {}
Doing the plots for each color vs color and computing the elapsed time with the jupyter magic command time.
%%time
j=1
for x in range((len(plots_names_color_color)-1)):
    if x%2 == 0:
        ### DEFINING THE TITLES AND THE LABELS OF EACH PLOT
        color_1_name = plots_names_color_color[x]
        color_1_name_low = color_1_name.lower()
        color_2_name = plots_names_color_color[x+1]
        color_2_name_low = color_2_name.lower()
        title = 'Color-Color Diagram - '+color_1_name+' x '+color_2_name
        color_1_label_name_upper = 'color '+color_1_name
        color_2_label_name_upper = 'color '+color_2_name
        ### WE WILL USE ALL THE BAND AND COLOR NAMES IN LOWER CASE FOR THE DICTIONARY KEYS.
        plot_name = color_1_name_low+'_'+color_2_name_low
        ### FOR COLOR 1
        ### Defining the name of the color 1 column as it is in the data table, and also defining the color 1 label (this will also be our holoviews dimension).
        color_1_catalog_name = 'mag_auto_'+color_1_name_low+'_dered'
        color_1_label_name = 'color '+color_1_name+' plot '+str(j)
        ### Defining the color 1 limits for display in the plots.
        color_1_xlim_min = basic_stats[color_1_catalog_name]['50%'] - color_1_n_spread*(np.abs(basic_stats[color_1_catalog_name]['50%']-basic_stats[color_1_catalog_name]['25%']))
        color_1_xlim_max = basic_stats[color_1_catalog_name]['50%'] + color_1_n_spread*(np.abs(basic_stats[color_1_catalog_name]['50%']-basic_stats[color_1_catalog_name]['75%']))
        ### FOR COLOR 2
        ### Defining the name of the color 2 column as it is in the data table, and also defining the color 2 label (this will also be our holoviews dimension).
        color_2_catalog_name = 'mag_auto_'+color_2_name_low+'_dered'
        color_2_label_name = 'color '+color_2_name+' plot '+str(j)
        ### Defining the color 2 limits for display in the plots.
        color_2_xlim_min = basic_stats[color_2_catalog_name]['50%'] - color_2_n_spread*(np.abs(basic_stats[color_2_catalog_name]['50%']-basic_stats[color_2_catalog_name]['25%']))
        color_2_xlim_max = basic_stats[color_2_catalog_name]['50%'] + color_2_n_spread*(np.abs(basic_stats[color_2_catalog_name]['50%']-basic_stats[color_2_catalog_name]['75%']))
        ### DEFINING THE POINTS ELEMENT IN HOLOVIEWS, CONTAINING COLOR 1 AND COLOR 2
        color_vs_color[plot_name] = hv.Points((df_input[color_1_catalog_name],df_input[color_2_catalog_name]), kdims=[color_1_label_name,color_2_label_name])
        ### CREATING THE LINKED STREAMS INSTANCE
        boundsxy_color_vs_color = (0, 0, 0, 0)
        box_color_vs_color[plot_name] = streams.BoundsXY(source=color_vs_color[plot_name], bounds=boundsxy_color_vs_color)
        bounds_color_vs_color[plot_name] = hv.DynamicMap(lambda bounds: hv.Bounds(bounds), streams=[box_color_vs_color[plot_name]])
        ### USING DATASHADER
        # 1) Types of cnorm -> log, linear, eq_hist. Eq_hist is better for the colors when we don't know the incoming distribution,
        # but is bad for the numbers in the colorbar.
        # 2) Use clim=(0, 10000) in the opts to fix the colorbar range.
        p_color_vs_color[plot_name] = dynspread(rasterize(color_vs_color[plot_name]).opts(cmap="Viridis", cnorm='log'))
        p_color_vs_color[plot_name] = p_color_vs_color[plot_name].opts(width=width, height=height, padding=padding, show_grid=True, xlim=(color_1_xlim_min, color_1_xlim_max),
            ylim=(color_2_xlim_min, color_2_xlim_max), xlabel=color_1_label_name_upper, ylabel=color_2_label_name_upper, tools=['box_select'],
            xticks=5, yticks=5, title=title, toolbar='above', colorbar=True)
        j+=1
color_vs_color_plot = ((p_color_vs_color['(g-r)_(r-i)'] * bounds_color_vs_color['(g-r)_(r-i)']) + (p_color_vs_color['(r-i)_(i-z)'] * bounds_color_vs_color['(r-i)_(i-z)']) +
                       (p_color_vs_color['(i-z)_(z-y)'] * bounds_color_vs_color['(i-z)_(z-y)'])).cols(2)
hv.save(color_vs_color_plot, 'color_vs_color_plot.html')
CPU times: user 2.88 s, sys: 135 ms, total: 3.02 s Wall time: 3.06 s
IFrame('color_vs_color_plot.html', width=1200, height=1000)