from nbtemplate import display_header, get_path
display_header('Performance_metrics.ipynb', status='Arcus Probe')
Last revision in version control:
The version shown here is modified compared to the last commited revision.
This document is git version controlled. The repository is available at https://github.com/hamogu/arcus. See git commit log for full revision history.
Code was last run with:
In this notebook, I show the effective area and resolution we can expect for Arcus with CAT gratings in the nominal design and according to the baseline tolerance budget.
This document contains mostly results of simulations, not the code itself because many of the plots below require a grid of simulations, e.g. in for different energies, and take while to run. Thus, those grids are run in a separate process and the relevant results (e.g. the position of the final rays on the detector) is saved in a fits file. In this notebook we then parse these fits files to extract the relevant information.
Details of the setup for the simulations are obviously defined in the python code, but a list of the most important effects that are included is here:
No simulation is perfect and there are many details of the Arcus design that are not or only very marginally relevant for the optical performance that are not covered in these simulations. The main caveats with respect to the optical design are:
import os
import sys
from glob import glob
import functools
import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl
%matplotlib inline
import astropy
import astropy.coordinates
from astropy.table import Table, Column
import astropy.units as u
plt.style.use("fivethirtyeight")
#plt.style.use("seaborn")
from cycler import cycler
import marxs
from marxs.source import PointSource, FixedPointing
from marxs.analysis.gratings import resolvingpower_from_photonlist, weighted_per_order
from marxs.missions.arcus import arcus
from marxs.missions.arcus.analyze_grid import orders_from_meta
from marxs.analysis.plots import OrderColor
#apertures = list(arcus.arcus.defaultconf['pos_opt_ax'].keys())
get_path('arcus')
'/Users/guenther/MITDropbox/ARCUS'
tarc = Table.read(os.path.join(get_path('arcus'), 'raygridRAeff.fits'))
tarc.orders = orders_from_meta(tarc.meta)
tper = Table.read(os.path.join(get_path('arcus'), 'raygrid-perfectRAeff.fits'))
tper.orders = orders_from_meta(tper.meta)
print('Files were written with')
print('MARXS version', tarc.meta['MARXSVER'])
print('ARCUS inputdata data version', tarc.meta['ARCDATDA'])
print('ARCUS inputdata git hash', tarc.meta['ARCDATHA'])
print('NOTE THAT THESE VERSIONS CAN BE DIFFERENT FROM THE LAST TIME THE NOTEBOOK WAS RUN')
Files were written with MARXS version 1.3.dev96 ARCUS inputdata data version 2023-01-13 11:41:07 ARCUS inputdata git hash 39c026a NOTE THAT THESE VERSIONS CAN BE DIFFERENT FROM THE LAST TIME THE NOTEBOOK WAS RUN
oc = OrderColor(max_order=15)
# Arguments used for fill_between for summed Aeff and averaged R
args_between = {'color': 'k', 'alpha': 0.5}
apertures = [1,2,3,4]
def plot_aeff(tab, aeff, plot_orders, ordercolor, ax=None):
if ax is None:
fig = plt.figure()
ax = fig.add_subplot(111)
ax.fill_between(tab['wave'], np.sum(aeff[:, tab.orders != 0], axis=1),
**args_between, label='all dispersed\norders')
for o in plot_orders:
ax.plot(tab['wave'], aeff[:, tab.orders == o], label='order {0}'.format(o),
**ordercolor(o))
temp = ax.set_xlim([8, 55])
return ax
# Do the same plot but split up by energy ranges
fig = plt.figure(figsize=(8, 5))
ax = fig.add_subplot(111)
plot_aeff(tper, tper['Aeff4'], np.arange(-12, 3), oc, ax)
ax.set_title('Effective area summed over all 4 channels')
fig.subplots_adjust(bottom=0.13, left=0.13, top=.85)
ax.set_xlim([0, 70])
ax.legend()
ax.set_xlabel('wavelength [$\AA{}$]')
ax.set_ylabel('$A_\mathrm{eff}$ [cm$^2]$')
fig.savefig(os.path.join(get_path('figures'), 'Aeff_16CCDs.png'), dpi=600, bbox_inches='tight')
fig.savefig(os.path.join(get_path('figures'), 'Aeff_16CCDs.pdf'))
Effective area curve for Arcus based on the instrument setup and nominal alignment for every element. The effective area for all four channels is added together. The contribution of individual grating orders is shown with colored curves. Each of the curves has a number of dips in it when an order falls into a chip gap. For any individual channel, the effective area drops to 0 in this case, but the channels are placed such the the chip gaps do not coincide in all of them, so the summed effective area suffers, but some signal is still detected.
The dark gray histogram sums the effective area of all detected dispersed orders, irrespective of their spectral resolving power. However, for most of the wavelength range, only orders that are very similar in resolving power (e.g. orders 4 and 5 at 30 Å) contribute anyway. The only exception to this rule is the region around 12 Å, where both order 1 and 9 contribute to the observed signal, with (as we will see below) significantly different $R$.
# Do the same plot but split up by channels
fig, axes = plt.subplots(2, 2, sharex=True, sharey=True, figsize=(8, 6))
for i, a in enumerate(apertures):
ax = axes[np.unravel_index(i, axes.shape)]
plot_aeff(tper, tper['Aeff'][:, i, :], np.arange(-12, 3), oc, ax)
ax.set_title('channel {}'.format(a))
for i in [0, 1]:
axes[1, i].set_xlabel('wavelength [$\AA{}$]')
axes[i, 0].set_ylabel('$A_\mathrm{eff}$ [cm$^2]$')
fig.subplots_adjust(bottom=0.13, left=0.13, top=.85, wspace=0.1, hspace=0.15)
#fig.savefig(os.path.join(get_path('figures'), 'Aeff_16CCDs.png'))
#fig.savefig(os.path.join(get_path('figures'), 'Aeff_16CCDs.pdf'))
This plot shows the effective area for each of the four channels separately, using the same color coding as above. The channels are very similar and at this scale the difference can hardly be seen, but in particular there is a slight offset in the position of the chip gaps between channel 0 and 1 on the one side and 2 and 3 on the other. Note that for each order, the effective area drops to 0, when the order falls into a chip gap, but the effective area summed over all orders does not, at least in those regions where more than one order contributes signal.
# Do the same plot but split up by channels
fig, axes = plt.subplots(2, 2, sharex=True, sharey=True, figsize=(8, 6))
for i, a in enumerate(apertures):
ax = axes[np.unravel_index(i, axes.shape)]
plot_aeff(tarc, tarc['Aeff'][:, i, :], np.arange(-9, 3), oc, ax)
ax.set_title('channel {}'.format(a))
ax.set_xlim(25, 27)
for i in [0, 1]:
axes[1, i].set_xlabel('wavelength [$\AA{}$]')
axes[i, 0].set_ylabel('$A_\mathrm{eff}$ [cm$^2]$')
fig.subplots_adjust(bottom=0.13, left=0.13, top=.85, wspace=0.1, hspace=0.15)
This plot shows a zoom in on one of the deep chip gaps. The wavelength resolution of the simulation is just sufficient to seed the chip gap in every case, not to resolve it (it appears wider than it should here, because not enough wavelength points are simulated), but the plots show that the placement of the channels places the chip gaps at different wavelengths.
I define the resolving power as: $R = \frac{\lambda}{\Delta \lambda} = \frac{d_x}{FWMH}$ where $\lambda$ is the wavelength of a spectral line with negligible intrinsic width, and $\Delta \lambda$ is the observed width of this feature. Since the detector does not give the wavelength directly, $d_x$ and the $FWHM$ are linear distances measured as follows: Events that hit a CCD are projected (not propagated, that would bring them out of focus) into a plane. The $FWMH$ is the full width at half maximum of the event distribution and $d_x$ is the distance between the center of a diffracted order and the zeroth order. Since the CCDs are flat and do not follow the Rowland circle exactly, the $R$ is somewhat lower than the $R$ expected on the Rowland circle. However, this effect is so small that it is negligible in practice. There is one problem with this procedure: If the center of the photon distribution is very close to a chip gap, then only a faction of the true distribution is visible on the chip and naturally the $FWHM$ of this distribution will be narrower than the true distribution. Thus, we also calculate the $R$ for a cylindrical detector on the Rowland circle and use the $R$ derived this way for positions close to the chip edge.
def plot_res(tab, res, orders, ordercolor, ax=None):
if ax is None:
fig = plt.figure()
ax = fig.add_subplot(111)
for o in orders:
i = tab.orders == o
ax.plot(tab['wave'], res[:, i], label='order {0}'.format(tab.orders[i][0]),
**ordercolor(o))
temp = ax.set_xlim([8, 55])
return ax
fig = plt.figure(figsize=(8, 5))
ax = fig.add_subplot(111)
ax.fill_between(tper['wave'], tarc['R_disp'], **args_between, label='average')
ax = plot_res(tper, tper['R4'], orders=np.arange(-14, 5), ordercolor=oc, ax=ax)
ax.set_xlabel('wavelength [$\AA{}$]')
ax.set_title('Spectral resolving power')
ax.set_ylabel('resolving power')
ax.legend()
ax.set_xlim(10., 70.)
#ax.set_xlim(5., 50.)
#ax.set_ylim(2400, 4400)
fig.subplots_adjust(bottom=0.13, left=0.13, top=.85)
fig.savefig(os.path.join(get_path('figures'), 'respower.png'), dpi=600, bbox_inches='tight')
fig.savefig(os.path.join(get_path('figures'), 'respower.pdf'))
Spectral resolving power for individual orders (colored lines). The resolving power is only calculated if a certain minimum number of photons is detected in an order. If most photons fall into a chip gap, then values on the curve might be missing. For each order, the resolving power increases with wavelength as the distance between the diffracted signal and the zeroth order increases. The gray line shows and average over all grating orders.
An interesting detail here is to compare positive and negative orders, e.g. -2 and +2. Arcus has two distinct cameras with eight CCDs each, which are place quite far apart. signal in positive order 2 is detected only for short wavelengths, where the signal falls close to the detector. However, due to the position of the zeroth order close to the edge of a detector, the negative second order is only visible once it falls on the far detector, i.e. at longer wavelengths. Yet, the relation between resolving power and wavelength falls on the same line.
def plot_res_mlambda(tab, res, orders, ordercolor, ax=None):
if ax is None:
fig = plt.figure()
ax = fig.add_subplot(111)
for o in orders:
i = tab.orders == o
ax.plot(np.abs(tab['wave'] * o), res[:, i],
label='order {0}'.format(tab.orders[i][0]),
**ordercolor(o))
return ax
fig = plt.figure(figsize=(8, 5))
ax = fig.add_subplot(111)
ax = plot_res_mlambda(tarc, tarc['R4'], orders=np.arange(-10, 5),
ordercolor=oc, ax=ax)
ax.legend(bbox_to_anchor=(1.05, 1.05))
temp = ax.set_title('Spectral resolving power')
ax.set_xlabel(' m * wavelength [$\AA{}$]')
ax.set_ylabel('resolving power')
fig.subplots_adjust(bottom=0.13, left=0.13, top=.85)
fig.savefig(os.path.join(get_path('figures'), 'respowerml.png'), dpi=600, bbox_inches='tight')
fig.savefig(os.path.join(get_path('figures'), 'respowerml.pdf'), bbox_inches='tight')
This plot shows the resolving power over $m$ (the order number) times the wavelength. $m\lambda$ sets the dispersion angle, so this confirms the point made above: The observed resolving power is essentially only a function of the dispersion angle where the photons are found and other parameters, such as the energy, are not important. Photons with a high energy are better reflected at small angles and thus the SPOs in the inner rows are more important than for photons with a lower energy. Thus, one expects an energy dependent PSF and possibly an energy dependent resolving power at the same diffraction angle. However, at the current level of detail for the mirror simulation this is not visible.
fig, axes = plt.subplots(2, 2, sharex=True, sharey=True, figsize=(8, 6))
for i, a in enumerate(apertures):
ax = plot_res(tper, tper['R'][:, i, :], orders=np.arange(-10, 5),
ordercolor=oc, ax=axes[np.unravel_index(i, axes.shape)])
temp = ax.set_title('R for channel {}'.format(a))
#ax.plot(wave, np.ma.average(res_ma[:, 0, :-1], weights=aeff[:, :-1], axis=1), 'k:')
for i in [0, 1]:
axes[1, i].set_xlabel('wavelength [$\AA{}$]')
axes[i, 0].set_ylabel('resolving power')
fig.subplots_adjust(bottom=0.13, left=0.13, top=.85, wspace=0.1, hspace=0.15)
fig.subplots_adjust(bottom=0.13, left=0.13, top=.85)
#fig.savefig(os.path.join(get_path('figures'), 'respower.png'))
#fig.savefig(os.path.join(get_path('figures'), 'respower.pdf'))
Using the same color scheme as above, these plots show the resolving power for individual channels. Except for the chip gaps, the channels are essentially identical.
fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(tper['wave'], tper['R_disp'], 'k', alpha=.5, lw=5.,label='perfect alignment')
ax.plot(tarc['wave'], tarc['R_disp'], 'r', alpha=.5, lw=5, label='baseline tolerance')
ind = tper.orders != 0
#ax.plot(tper['wave'],
# np.ma.average(np.ma.masked_invalid(tper['R4'][:, ind]),
# weights=np.ma.masked_less(tper['Aeff4'][:, ind], 20),
# axis=1), 'k', lw=2, ls='solid', label='__no_legend__')
#ax.plot(tarc['wave'],
# np.ma.average(np.ma.masked_invalid(tarc['R4'][:, ind]),
# weights=np.ma.masked_less(tarc['Aeff4'][:, ind], 20),
# axis=1), 'r', lw=2, ls='solid', label='__no_legend__')
ax.set_xlabel('wavelength [$\AA{}$]')
ax.set_ylabel('resolving power')
fig.subplots_adjust(bottom=0.13, left=0.13, top=.85)
ax.legend(loc='lower right')
fig.savefig(os.path.join(get_path('figures'), 'averageres.png'), bbox_inches='tight')
fig.savefig(os.path.join(get_path('figures'), 'averageres.pdf'), bbox_inches='tight')
This figure shows the resolving power averaged over all channels and all dispersed spectral orders. When averaging, values are weighted by their respective effective area. The thick lines average truly over all dispersed orders including positive and negative orders. For all wavelengths, more than one order contributes and dips in $R$ occur when one order falls into a chip gap and the remaining orders have a lower $R$. This highlights the importance of a careful placement of the chip gaps or dithering to smooth out the gaps. In the region 38-42 Å only a single order contributes, and so we see the $R$ rising with wavelength as in the previous plots. For small wavelengths $<12$ Å very low orders contribute to the signal, bringing down the average $R$.
The plot shows almost no difference between the perfect alignment and the baseline alignment tolerance scenario. This is consistent with the results in the Alignment budget memo where the drop in $R$ is only one or two percent. If at all, the baseline alignment tolerance curve looks a little smoother and the drops or peaks (e.g. around 46 Å) that happen when one order falls into a chip gap are less pronounced. This is likely just an artifact of the wavelength step used in these simulations. If the signal is spread out a little more on the CCD, then photons are still collected in cases where the perfect alignment scenario has insufficient signal to calculate $R$.
Over most of the wavelength range, $R$ is almost constant. That is not a coincidence, but rooted in the physics of CAT gratings. The spectral resolving power is measured as $R = \frac{d_x}{FWMH}$. The $FWHM$ is dominated by the mirror point spread function (PSF). In practice the PSF is slightly energy dependent, but this is not captured in the simple mirror model used for these simulations. All other effects on the $FWHM$ are very small (for example, it technically also depends on where on a CCD the signal falls since the CCDs are not curved and do not follow the Rowland circle exactly, but for Arcus this effect is negligible). So, the only variable left is $d_x$, the distance of the spot from the zeroth order. Photons with a longer wavelength are dispersed to larger distances and thus have larger $d_x$ which explains why $R$ increases with wavelength for every order. On the other hand, CAT gratings most efficiently diffract photons to a particular angle (the blaze peak). Thus, for longer and longer wavelengths the signal in an order becomes weaker, because the position of the order moves away from the blaze peak. At the same time, the next lower order moves towards the blaze peak and becomes increasingly stronger. In other words, the higher order (which has a higher $R$) drops in effective area, while the next lower order gains effective area. Together, this means that the averaged resolving power stays essentially constant.
def combine_figures_of_merit(aeff, res):
out = np.zeros(aeff.shape)
for i in range(len(out)):
res_i = res[:, 0, i]
sort_res = np.argsort(res_i)
merits = np.sqrt(res[:, i] * aeff[:, i])
out[i] = np.nansum(merits[sort_res[:2]])
return out
def plot_merit(tab, aeff, res, orders, ordercolor):
fig = plt.figure()
ax = fig.add_subplot(111)
#ax.plot(wave, combine_figures_of_merit(aeff, res), 'k',
# label='top 2 orders\nadded')
for o in orders:
i = tab.orders == o
ax.plot(tab['wave'], np.sqrt(res[:, i] * aeff[:, i]),
label='order {0}'.format(tab.orders[i][0]),
**ordercolor(o))
ax.legend(ncol=1, bbox_to_anchor=(1.05, 1.05))
ax.set_xlabel('wavelength [$\AA{}$]')
ax.set_ylabel('Figure of mrrit [cm]')
return fig, ax
fig, ax = plot_merit(tper, tper['Aeff4'], tper['R4'], np.arange(-10, 3), ordercolor=oc)
ax.set_xlim([5, 75])
temp = ax.set_title('Figure of merit for line detection')
The figure of merit for line detection is $\sqrt{R*A_\mathrm{eff}}$. It can be derived from the plots above and is shown here for the most relevant orders. The figure of merit changes with the scientific task at hand (measure line shifts, measure line fluxes), but this particular form is used in the Arcus proposal and thus shown here for illustration.
Chip gaps cause drops in the effective area and thus in the figure of merit. The most extreme drops have been omitted in the figure above.
The plots above describe the wavelength-dependent $R$ and $A_\mathrm{eff}$. However, for a descriptive text, we often need to summarize this by averaging over bands.
wvlintervals = [[21.6, 28], [33.7, 40.], [16., 21.6], [12., 50.]]
from marxs.analysis.gratings import average_R_Aeff
avg_performance = Table(names=['from','to', 'Aeff', 'R'],
units=[u.Angstrom, u.Angstrom, u.cm**2, u.dimensionless_unscaled],
dtype=[float, float, float, float])
for j, interval in enumerate(wvlintervals):
ind = (tarc['wave'] > interval[0]) & (tarc['wave'] < interval[1])
rdisp = tarc['R_disp'][ind]
aeffdisp = tarc['Aeff_disp'][ind]
ravg, temp = average_R_Aeff(rdisp, aeffdisp)
aavg = np.average(aeffdisp)
avg_performance.add_row([interval[0], interval[1], aavg, ravg])
avg_performance['Aeff'].format = '{:6.1f}'
avg_performance['R'].format = '{:6.1f}'
avg_performance
from | to | Aeff | R |
---|---|---|---|
Angstrom | Angstrom | cm2 | |
float64 | float64 | float64 | float64 |
21.6 | 28.0 | 468.8 | 3551.8 |
33.7 | 40.0 | 284.9 | 3409.9 |
16.0 | 21.6 | 516.7 | 3610.6 |
12.0 | 50.0 | 351.6 | 3503.7 |
The table summarizes average effective area and resolving power in a few spectral bands that are called out specifically in the STM. For this table, $R$ is the area-weighted average, but in practice the average $R$ that an observation would deliver depends on the spectral shape. If most of the signal is in a region of the wavelength with higher-than-average $R$ then $R$ averaged over the observed photon count number might also be higher than the number in the table and vice-versa. That is why we use the wavelength-dependent $R$ in the form of the response matrix for detailed simulations.
from marxs.missions.arcus.analyze_grid import csv_per_order
csv_per_order(os.path.join(get_path('arcus'), 'raygridRAeff.fits'), 'Aeff4',
os.path.join(get_path('figures'), 'Aeff.csv'))
csv_per_order(os.path.join(get_path('arcus'), 'raygridRAeff.fits'), 'R4',
os.path.join(get_path('figures'), 'R.csv'))
csv_per_order(os.path.join(get_path('arcus'), 'raygrid-perfectRAeff.fits'), 'Aeff4',
os.path.join(get_path('figures'), 'Aeff-perfect.csv'))
csv_per_order(os.path.join(get_path('arcus'), 'raygrid-perfectRAeff.fits'), 'R4',
os.path.join(get_path('figures'), 'R-perfect.csv'))
In this memo, I have shown the performance of Arcus over the design bandpass of the gratings for nominal SPOs (no coating). All the "data behind the figures" are available in a Dropbox account, please contact Moritz for the links.