At the heart of the Arcus design are the critical angle transmission (CAT) gratings, which diffract the light. The Space Nanotechnology Laboratory at MIT is constantly working on improving parameters and fabrication and thus different variants with different thickness, different grating bar width and with or without a coating could be used for Arcus. Depending on the grating parameters, the gratings will have to mounted at different blaze angles for optimal extraction, and that means that the spacing between channels and detectors needs to be different for optimal performance. In this notebook, we look at those optimizations and the expected performance for different grating types, so that the Arcus team has the information it needs to trade-off between capabilities and technical readiness of the gratings.
from nbtemplate import display_header, get_path
display_header('SelectCATtype.ipynb')
Last revision in version control:
This document is git version controlled. The repository is available at https://github.com/hamogu/arcus. See git commit log for full revision history.
Code was last run with:
In an early stage of the Arcus design, there are just too many parameters in flux, to exhaustively run a grid of models over all possible parameter combinations. Thus, I am making some choices which parameters are likely to impact the results in a meaningful way, and which parameters are already set by other constraints, e.g. the SPOs are chosen to have an exact equivalent in Athena for radius and size, but their angular position can be modified to make them fit the Arcus petal. While in principle it is possible to manufacture different SPOs, minor changes (e.g. cutting off the edges by a few mm) will not change the conclusions drawn here and major changes are cost prohibitive for the Arcus mission. I am selecting a set of SPOs and positions and run all simulations with that. Any later changes (e.g. adding or removing SPO rows) will impact all scenarios in the same direction, e.g. removing row 1 reduces the effective area ($A_\mathrm{eff}$ or sometimes simply labelled as $A$ for short below), but increases the resolving power $R$. While this may change the absolute numbers given for $A$ and $R$ below, a grating type that performs better than another type with SPOs in row 1 will still perform better than the other type without SPOs in row 1.
The selection of SPOs is studied separately in a different place.
So, the main parameters changing in the simulations below are:
import warnings
from astropy import table
import astropy.units as u
import numpy as np
from matplotlib import pyplot as plt
import seaborn as sns
%matplotlib inline
from glob import glob
orders = np.arange(-15, 4)
outpath = get_path('grid2designtorus')
filelist = glob(os.path.join(outpath, '*.fits'))
filelist.sort()
def add_phifolded(p):
p.meta['phi_m'] = np.arcsin(p.meta['D_CHAN'] / 2 / p.meta['CIRCLE_R'])
p.meta['phi_0'] = 2 * p.meta['phi_m']
# make new column "distance from phi_m"
p['phi_folded'] = np.abs(p['circ_phi'])
# from arcus analyse_design
# Keep here? Put in marxs itself? put in separate file and define there?
def angle_covered_by_CCDs(p, n_ccds=8):
return n_ccds * 49.5 / p.meta['CIRCLE_R']
def best_ccd_pos_incl0(p, n_ccds=8):
'''One possible function to select the "optimal" detector position
Parameters
----------
p: photon list
needs to have phi_folded in it
'''
ang8 = angle_covered_by_CCDs(p, n_ccds=n_ccds)
binwidth = angle_covered_by_CCDs(p, n_ccds=0.1)
# Approximate, but good enough for this purpose
# For example, ignore +- 2.5 mm offset between channel
zeropos = np.arcsin((p.meta['D_CHAN'] / 2) / p.meta['CIRCLE_R'])
# Look at region +- 8 CCDS from the zeros order, because we definitely
# want that in the range.
# Don't go +/- 8 CCDs, but a little less, so zeroth order is never
# exactly on the edge of detector
bins = np.arange(
# Don't go below 0 because that's where we fold
max(0, zeropos - (ang8 - binwidth)),
zeropos + (ang8 + binwidth),
binwidth)
hist, bin_edges = np.histogram(p['phi_folded'], weights=p['probability'], bins=bins)
signal = np.cumsum(hist)
signal8 = signal[int(n_ccds / 0.1):] - signal[:-int(n_ccds / 0.1)]
return bins[np.argmax(signal8)]
def make_det_scenarios(p):
pdisp = p[p['order'] != 0]
# Scenario 0: All photons
det_scenarios = [{'phi_start': 0., 'phi_stop': .2,
'scenario_name': 'all_photons'}]
# Scenario 1: Maximize highly dispersed photons
phistart = best_ccd_pos_incl0(p[p['circ_phi'] > 0])
det_scenarios.append({'phi_start': phistart,
'phi_stop': phistart + angle_covered_by_CCDs(p),
'scenario_name': '8 CCDs: high-res'})
# Scenario 1b: Maximize highly dispersed photons
phistart = best_ccd_pos_incl0(p[p['circ_phi'] > 0], n_ccds=10)
det_scenarios.append({'phi_start': phistart,
'phi_stop': phistart + angle_covered_by_CCDs(p, n_ccds=10),
'scenario_name': '10 CCDs: high-res'})
# Scenario 2: Maximize dispersed photons
phistart = best_ccd_pos_incl0(pdisp)
det_scenarios.append({'phi_start': phistart,
'phi_stop': phistart + angle_covered_by_CCDs(p),
'scenario_name': '8 CCDs: all'})
# Scenario 3: Maximize dispersed O VII photons
po7 = pdisp[(pdisp['wave'] > 21.6) & (pdisp['wave'] < 28.01)]
phistart = best_ccd_pos_incl0(po7)
det_scenarios.append({'phi_start': phistart,
'phi_stop': phistart + angle_covered_by_CCDs(p),
'scenario_name': 'max O VII'})
# Scenario 4: Maximize band 33-40 Ang
#po7 = pdisp[(pdisp['wave'] > 33.7) & (pdisp['wave'] < 40.01)]
#phistart = best_ccd_pos_incl0(po7)
#det_scenarios.append({'phi_start': phistart,
# 'phi_stop': phistart + angle_covered_by_CCDs(p),
# 'scenario_name': 'max 33.7-40 Ang'})
return det_scenarios
from marxs.analysis.gratings import (effectivearea_from_photonlist,
resolvingpower_from_photonlist)
def load_simulation(filename):
evt = table.Table.read(filename)
ind = np.isfinite(evt['circ_phi']) & np.isfinite(evt['order']) & (evt['probability'] > 0)
p = evt[ind]
add_phifolded(p)
p['wave'] = p['energy'].to(u.Angstrom, equivalencies=u.spectral())
return p
import warnings
orders = np.arange(-15, 4)
results = []
for f in filelist:
p = load_simulation(f)
zeropos = np.median(p[p['order'] == 0]['circ_phi'])
for s in make_det_scenarios(p):
p_scenario = p[(p['phi_folded'] > s['phi_start']) & (p['phi_folded'] < s['phi_stop'])]
p_wave = p_scenario['wave', 'probability', 'circ_phi', 'order'].group_by('wave')
for wave, group in zip(p_wave.groups.keys, p_wave.groups):
wave = wave[0]
# see https://github.com/astropy/astropy/issues/13281
# Since we know the warning is a bug, let's ignore it.
with warnings.catch_warnings():
warnings.filterwarnings(action='ignore', category=UserWarning)
aeff = effectivearea_from_photonlist(group, orders, group.meta['N_PHOT'], group.meta['A_GEOM'] * u.cm**2)
res, pos, std = resolvingpower_from_photonlist(group, orders, col='circ_phi', zeropos=zeropos)
results.append({'filename': f,
'scenario': s['scenario_name'],
'phirange': (s['phi_start'], s['phi_stop']),
'wave': wave,
'aeff': aeff,
'res': res} |
{k : p.meta[k] for k in ['TORUS_R','CIRCLE_R','MAX_F','BLAZE','D_CHAN', 'GRATTYPE']},
)
/Users/guenther/mambaforge/envs/Arcus/lib/python3.10/site-packages/numpy/core/fromnumeric.py:758: UserWarning: Warning: 'partition' will ignore the 'mask' of the MaskedColumn. a.partition(kth, axis=axis, kind=kind, order=order)
tab = table.Table(rows=results)
In the simulations, I used a detector that covered the entire Rowland circle. That allows us to study the best position for the CCDs after the fact without rerunning time-consuming Monte-Carlo simulations.
Here, I study five different detector placement algorithms:
1) all_photons: Detector is a larger number of CCDs, such that every photons gets counted 2) 8 CCDs: 8 CCDs per camera, for a total of 16 CCDs. The location of each camera (each with 8 CCDs) is limited to locations that guarantee that the zero order is detected. Under that condition, it is then moved around to optimize the total number of dispersed orders with relatively high resolving power (i.e. those $\phi > 0$ in the plots below). 3) Same as (2), but with 10 CCDs per camera. 4) Same as (2), but using all dispersed photons, also those with low $R$ relatively close to the optical axis (order +/-1, +/-2, etc.). This potentially increases $A$ and the cost of loosing some highly-dispersed photons and thus reduced $R$. 5) max O VII: like 8 CCDs, but optimizing just for the number of photons in a band close to the O VII triplet
I can easily add more strategies and re-run the plots below for example to optimize for different energy bands, but these algorithms cover a range of probabilities that are reasonable for Arcus.
The input spectrum is list of discrete energies over the Arcus bandpass; the main purpose here is not to simulate a real spectrum, but to cover the bandpass well enough to calculate $R$ and $A$ with a sufficient number of counts. That means that "maximizing the number of counts" might result in different best detector position depending on the spectrum of the astrophysical object observed. However, most of the dispersed photons fall into the blaze peak, so that this is not a strong dependence and thus is sufficient for this study.
# Look at the last file that's still in memory.
# Just to check that everything is normal, the format looks OK etc.
def plot_phi_hist(p, orders_to_plot = [3, 2, 1, 0, -1, -2, -3, -4, -5, -6, -7, -8, -9, -10, -11]):
fig, ax = plt.subplots(nrows=2, sharex=True, height_ratios=[1, 4],
gridspec_kw={'hspace': 0})
for i, s in enumerate(make_det_scenarios(p)):
out = ax[0].plot([s['phi_start'], s['phi_stop']], [i, i], label=s['scenario_name'], lw=8)
ax[0].plot([-s['phi_start'], -s['phi_stop']], [i, i], label='_no_legend_',
color=out[0].get_color(), lw=8)
ax[0].text(.21, i, s['scenario_name'])
#ax[0].legend()
ax[0].set_axis_off()
cmap = plt.get_cmap('nipy_spectral')
color = cmap(np.linspace(0, 1, max(np.abs(orders_to_plot)) + 1))
colors = [color[abs(i)] for i in orders_to_plot]
out = ax[1].hist([p['circ_phi'][p['order']==i] for i in orders_to_plot],
color=colors,
bins=100, stacked=True, label=orders_to_plot)
ax[1].legend()
ax[1].set_xlabel('angle on Rowland circle (rad)')
out = ax[1].set_ylabel('counts / bin')
return fig
p.meta['BLAZE'], p.meta['D_CHAN'], p.meta['GRATTYPE']
(2.2, 750.0, '5.7-50-Si')
fig = plot_phi_hist(p)
For one example, this plot shows how photons are distributed on the Rowland circle (blaze angle 2.2 deg, DCHAN=750 mm, grating type 5.7-50-Si). The input spectrum is list of discrete energies over the Arcus bandpass; the main purpose here is not to simulate a real spectrum, but to cover the bandpass well enough to calculate $R$ and $A$ with a sufficient number of counts.
Positive orders are to the left of the zeroth order, negative orders on the right. The colored bars on the top of the plot indicate which areas are covered by CCDs for different CCD placement scenarios. Arcus has two cameras with a gap in between, thus there are two strips symmetric around $\phi=0$.
The zeroth order is found around -0.06 rad, because the circle is centered on the geometrical center of the instrument in the middle between the optical axis of the separate channels. In the case of the single channel simulated here, the zeroth order is to the left of that center by about the same distance as the blaze peak - the location where the brightest diffracted orders are seen. The simulation is run for a few discrete wavelengths; since the diffraction angle depends on wavelength e.g. order -4 is found at different locations.
In this example, the CCD are all in a very similar location, but that might be different in other cases, an example of which is shown in the plot below. In both cases, it is clear that using a larger number of CCDs will capture more of the wings of the blaze peak and thus provide more effective area. While most of the photons are in the blaze peak, the difference can matter a lot more for very specific wavelength.
f = tab[(tab['D_CHAN'] == 700.) & (tab['GRATTYPE'] == '4-70-Si') & (tab['BLAZE'] == 1.8)]['filename'][0]
fig = plot_phi_hist(load_simulation(f))
This example (blaze angle 1.8 deg, DCHAN=7500 mm, grating type 4-70-Si) shows how different strategies to place the detectors can impact the effective area: There is a relatively large number of photons in low orders close to the zeroth order and the "8 CCDs: all" scenario shifts the CCDs to catch those at the cost of missing the far end of the blaze peak.
tab['0 order covered'] = tab['aeff'].data[:, orders==0].flatten() > 0
### average by order number
high_res = orders < -3
grating_aeff = tab['aeff'].data[:, high_res]
tab['aeff'].data[:, orders < -2]
tab['Aeff (high res)'] = grating_aeff.sum(axis=1)
tab['R (high res)'] = np.ma.average(np.ma.masked_invalid(tab['res'].data[:, high_res]),
weights=grating_aeff / grating_aeff.sum(axis=1)[:, None],
axis=1)
### by R > 1000
highres = tab['res'].data > 1000
res_arr = np.ma.masked_array(tab['res'].data, mask=~highres)
aeff_arr = np.ma.masked_array(tab['aeff'].data, mask=~highres)
tab['A (R > 1000)'] = aeff_arr.sum(axis=1)
tab['R (R > 1000)'] = np.ma.average(res_arr,
weights=aeff_arr / aeff_arr.sum(axis=1)[:, None],
axis=1)
/var/folders/r7/f0qh27rn207bwwvt3s7v5gh40000gn/T/ipykernel_18584/712701851.py:10: RuntimeWarning: invalid value encountered in divide weights=grating_aeff / grating_aeff.sum(axis=1)[:, None],
# Conversion from energy to wave leaves values like 14.9999999999999997
# Let's fix that.
tab['wave'] = np.around(tab['wave'], decimals=3)
All simulations are for one channel only, so $A_\mathrm{eff}$ of all four channels together will be four times larger, except at the edges (due to channels being displaced by a few mm with respect to each other) or when it hits a chip gap.
The simulations are run for a grid over several parameters and we have multiple ways to slice and dice the results. We can, for example, look at one particular grating type and show how the efficiency changes with blaze angle and torus parameters at a particular wavelength. Or, we could look at how the efficiency for a given blaze angle depends on the wavelength (the classic effective area curve). Below, I'm going to explore a few of those options, but this is not a complete set of all possibilities.
Our dataset has the following main parameters
And, reminder: There are other parameters that I have left unchanged over all simulations, namely the set-up of the SPOs. I'm taken a realistic design, but not necessarily the design that Arcus will end up with, e.g. I'm using row 1 from Athena here; I have not changed the coating of the SPOs, etc.
We also need to handle the issue that we'll see different orders for more wavelength. Those orders will have different R and Aeff, yet for plots that compare other parameters we need to simplify and not plot Aeff and $R$ for each order separately. So, I'm making an effective-area weighted average of the $R$ values for all the high-resolution orders. It is not a good idea to simply average over all orders, because in some cases, we see order 0, -1, and -8, where $R$ in -8 is orders of magnitude better than in orders -1 and 0. However, what makes an order "high resolution" such that it should be included in this average? For long wavelength ($ > 15$ Ang or so), typically "high-resolution" means higher orders or "orders with $R>1000$" so that's what I'm using for the plots below. However, that gives some odd results at some wavelengths, e.g. if we have two orders that contribute equal effective area with $R=1010,1200$ and a small change in parameters make the resolution drop a little, we might end up with $R=950, 1150$. Now, the first of those orders is no longer considered "high-resolution" and not part of the average, so the average $R$ apparently jumps from $\frac{1010 + 1200}{2}=1105$ to $1150$. Similarly, a small change might make one high-resolution order drop off the detector, causing a sudden jump in $R$ and $A_\mathrm{eff}$.
I will first show a few cuts through this simulated data to explain how to read it and then provide an interactive display allowing you to explore all the scenarios I simulated.
# Pandas dataframes can't have multi-D columns, but they have nifty build-in plotting
# So, make a dataframe for those columns that are not multi-D
df = tab.copy()
df['phistart'] = tab['phirange'].data[:, 0]
df['phistop'] = tab['phirange'].data[:, 1]
df.remove_columns(['aeff', 'res','phirange'])
df = df.to_pandas()
out = sns.relplot(
data=df[(df.D_CHAN == 700.) & (df.GRATTYPE == '4-70-Si')],
x='wave', y='A (R > 1000)', col="GRATTYPE", hue='BLAZE',
kind='line', errorbar=lambda x:(x.min(), x.max()), linewidth=0
)
Effective area depending on wavelength for a 4-70-Si at D_CHAN == 700 mm for different blaze angle. For each blaze angle, the shaded area marks the region from the CCD placement scenario that gives the best value for $A$ (the "all_photons" scenario with > 30 CCDs) to the one the gives the least area (one of the scenarios with 8 CCDs per camera).
The width of the strips shows that we are loosing up to 40% of the effective area when the blaze peak is not covered well with CCDs.
In the following, we will concentrate on the "8 CCDs (high-res)" scenario that optimized coverage of the blaze peak with 8 CCDs per camera, but we need to keep in mind that, depending on the configuration we choose in the end, we might add significant effective area by having more CCDs.
out = sns.relplot(
data=df[(df.wave == 20.0) & (df.GRATTYPE == "4-60-Si")],
x='R (R > 1000)', y='A (R > 1000)', hue="BLAZE", col="scenario", size='D_CHAN',
col_wrap=3, height=3,
)
R and Aeff at 20 Ang for different detector placements. When all photons are detected (left), $R$ simply increases with the blaze angle. The larger the blaze angle, the further out the blaze peak; in other words, photons get diffracted into higher orders. Because the dispersed photons are further away from the optical axis in higher orders, $R$ increases approximately linearly with the blaze angle (in degrees). The effective area $A$ increases at first, until it hits an optimum around 1.6 deg and then increases again. D_CHAN is the physical distance between channels. If it is too small, the SPOs of different channels overlap, if it's too large the channels won't fit within the front assembly. The number given is measured in mm center-to-center for SPO petals, which is the distance of the two optical axis. For the "all_photons" scenario, D_CHAN has only very small influence on $R$ and $A$, but if the number of CCDs is limited, photons may fall of the detector. For small D_CHAN, the detectors need to be close together to make sure that all zeroth orders land on a CCD. At large blaze angles (e.g. $>1.9$ deg), the blaze peak is so far out, that a camera cannot see the major important dispersed orders any longer and thus $A$ is much lower for small D_CHAN values as shown e.g. by the black points in the "max O VII" scneario.) Sometimes, this also lowers $R$, which is shown averaged over the visible orders, if one of the highest orders drops of the chip.
sns.relplot(
data=df[(df.wave == 20.0) & (df.D_CHAN == 650.)],
x='R (R > 1000)', y='A (R > 1000)', hue="BLAZE", col="GRATTYPE", size='scenario',
col_wrap=4, height=3,
)
<seaborn.axisgrid.FacetGrid at 0x296c502e0>
Again choosing 20 Ang as an example and selecting a larger D_CHAN = 700 mm, this plot shows how the blaze angle changes the effective area. Deeper gratings (those starting with "5.7" in the name) peak at lower blaze angles and they drop-off more to larger blaze angles; for example the "5.7-40-Si-6-Ni", a 5.7 $\mu$m deep grating with 30 nm wide Si bars, coated with 6 nm of Pt) peaks at 1.2 deg blaze angle or even below that (1.2 deg is the smallest angle in this set of simulations), while the 4 $\mu$m deep gratings peak at 1.6 deg.
There are a fer parameters we can choose (within limits): The blaze angle and the channel spacing. So, for each grating type, we need to select the "best" blaze and D_CHAN and then see what we get for Aeff and R. However, as the plots above showed, choosing the "best" combination brings in some trade-off between $R$ and $A$. We can go to a higher blaze angle to increase $R$ but sacrifice some effective area if that blaze angle is larger than the optimal solution.
Fortunately, $R$ for the high-res orders depends essentially only on the blaze angle.
hue_order = [
'4-70-Si',
'4-60-Si',
'4-60-Si-6-Pt',
'4-60-Si-6-Ni',
'4-40-Si-6-Pt',
'5.0-50-Si',
'5.7-50-Si',
'5.7-40-Si',
'5.7-40-Si-6-Pt',
'5.7-40-Si-6-Ni',
'5.7-28-Si-6-Pt',
]
hue_kws = {
'linewidth': [{'4': 1, '5.0': 1.7, '5.7': 3}[h.split('-')[0]] for h in hue_order],
'linestyle': ['--' if '-6-' in h else 'solid' for h in hue_order]
}
g = sns.FacetGrid(
data=df[(df.scenario == "8 CCDs: high-res") & (df.D_CHAN == 650.)],
col="BLAZE", hue='GRATTYPE', hue_order=hue_order, hue_kws=hue_kws,
height=3
)
out = g.map_dataframe(sns.lineplot, x='wave', y='R (R > 1000)')
out = g.add_legend()
$R$ for different blaze angles. $R$ is shown as a weighted average over all orders with $R>1000$, i.e. all the photons in the blaze peak. A remarkable property of CAT gratings is that $R$ is essentially constant with wavelength; the fundamental reason for this is that all photons are diffracted into the blaze peak. They end up in different diffraction orders, but at very similar detector locations and thus have the same $R$. Small wiggles stem from numerics of a Monte Carlo simulation and from the fact that sometimes one of the lesser contributing (but not negligible) order is not covered by the CCDs.
There is some scatter at very low wavelength because the number of photons that get dispersed into the blaze peak is lower. In the next plot, we will show the value for $R$ at 25 Ang to avoid those problems and get an $R$ that is representative for each blaze angle.
out = sns.boxplot(data=df[(df.scenario == "8 CCDs: high-res") & (df.wave==25.)],
x='BLAZE', y='R (R > 1000)')
Resolving power vs. blaze angle (in deg). The box includes simulations for different D_CHAN and different gratings types. The boxes show the quartiles of the distribution, whiskers the full distribution, black markers any outliers.
We see that the distributions are very narrow and that $R$ depends linearly on the blaze angle. Given this plot, we can pick out the minimum $R$ that is required by the science and thus the minimum blaze angle.
For example, if we want to achieve a $R> 3500$, then we need to get to a blaze $>1.8 deg$ with some margin, since the simulation above do not yet apply an alignment budget.
Knowing the minimal blaze angle, we can now look for the best grating to use, taking into account the effective area and the TRL of the grating.
In general, smaller blaze angles give better effective area $A$.
g = sns.FacetGrid(
data=df[(df.scenario == "8 CCDs: high-res") & (df.D_CHAN == 650.)],
col="BLAZE", hue='GRATTYPE', hue_order=hue_order, hue_kws=hue_kws,
height=3, col_wrap=3
)
out = g.map_dataframe(sns.lineplot, x='wave', y='A (R > 1000)')
out = g.add_legend()
Effective area curves for different gratings and different blaze angles. Pure Si gratings are shown with solid lines, coated gratings with dotted lines. The deeper the grating the ticker the line.
Following the example from the last plot for $R> 3500$, we are looking at blaze angles 2.0 or 2.2 degrees. Since the effective area drops for larger blazes, we select 2.0 degrees. At that wavelength, we see that coated gratings are significantly better at short wavelength, but in the prime Arcus bandpass, the pure Si gratings perform better. The deeper the grating, and the thinner the bars, the better the effective area.
The simulations above concentrate on a few parameters that are relevant to choose the best grating. While the number for $A$ is reasonable, there are a number of effects not included, because they affect each grating in the same way. When they are taken into account, the relative ordering of gratings stays the same, but the absolute value of $A$ might change.