*In Arcus, several diffraction orders fall onto the same position on the detector and detected photons need to be assigned to a grating order based on the energy that the CCD determines (order-sorting). Because there are more orders and orders are closer together than in, e.g., Chandra/HETGS or XMM/RGS, orders actually overlap in energy. In this notebook, I give an overview how order sorting works, how exactly we can define the order-sorting regions for Arcus, and show that we can expect up to 20% contamination from other orders in an extracted spectrum.*

In [14]:

```
from nbtemplate import display_header, get_path
display_header('OrderSorting.ipynb', status='Arcus Probe')
```

Last revision in version control:

- Author: H. Moritz Günther hgunther@mit.edu
- commit bbf2c9206e2dbc2ea62cbc4c581c63cfe8a8a6a4
- Date: Wed Mar 1 10:26:07 2023 -0500

This document is git version controlled. The repository is available at https://github.com/hamogu/arcus. See git commit log for full revision history.

Code was last run with:

- MARXS ray-trace code version 1.3.dev109 (commit hash: 675e82d) (Date of dirty version: 20230228)
- ARCUS CALDB version 3bec96a (2023-02-15 18:14:20)

In Arcus, the CAT gratings disperse the light into different orders and it depends on the wavelength which order is the most efficient. The gratings are blazed, which means that the photons hit the grating bars at an angle (the blaze angle) and the photons are preferentially diffracted into orders close to the blaze angle. For the Arcus design, this means that we can cover many orders with a limited number of CCDs, because most of the photons are diffracted to the same area on the detector. On the other hand, we need to make use of the CCD energy resolution to decide which photon belongs to which order.

In principle, this is the same process as for Chandra/HETG or XMM/RGS, except that Arcus has more orders and they are close together in energy, which means that photons from one order will sometimes be errornously be detected as photons from a different order and thus appear at the wrong wavelength in the extracted spectrum.

How bad exactly contamination from neighbouring orders is depends on the science goal and the spectrum of the sources. Detailed simulations are the best way to answer that question. The purpose of this notebook is to conceptually explain what's going on.

In general, contamination due to insufficient order-sorting is not a problem for the study of emission lines. For example, to get a radial-velocity curve of a coronal source, we only need to measure the position of one or a few strong emission lines. We roughly know where to look for those lines and might, for example, choose to study the O VIII line at 18.97 Å. We will be able to identify this particular line in order -5 even if the order shows a few weaker lines that should not be there. For example the O VII resonance line at 21.45 Å in order -4 will also contribute some photons to order -5, where they would be interpreted as $21.45 * 4/5=17.16$ Å photons. (The grating equation is $m\lambda = d \sin\theta$. At a given position on the detector, i.e. given $\theta$ we need to know $m$ to assign the true wavelength. If we use $m=5$ instead of $m=4$ because of insufficient order-sorting, we will find the O VII line at a position that we associate with 17.16 Å.)

Combining a-prioriy knowledge of the lines expected in a coronal source and comparing different diffraction orders (e.g. -5 has weak contamination from -4 and -6. If we expect a line to be contamination from -4, then that particular line should be strong in -4 and weak again in -3; if it was contamination from -6, then it should be strong in -6 and weak in -7) we will be able to identify all features. Using the proper ARFs, corrected for the effects of order-sorting, we can also perform a formal fit.

The situation is different for an analysis of absorption lines. Searching for a weak absortion line at an unknown position, we essentially have to see the dip in the spectrum. Contamination from neighboring orders makes this task much harder, because it increases the level of the continuum. The continuum in order -5 is now a combination of order -4 (weak contribution), -5 (stronger contribution), and -6 (weak contribution). An absortion line in order -5 thus appears weaker, because it is now swamped by the additional continuum from orders -4 and -6, acting essentially as an additional "background". Because of this background, the Poisson fluctuations in the continuum are higher and a deeper absorption line is required for a significant detection. In these situations, it can be beneficial to choose a narrower order-sorting region, even if that reduces the overall count rate.

In [15]:

```
from os.path import join as pjoin
import numpy as np
import astropy.units as u
import matplotlib.pyplot as plt
from astropy.visualization import quantity_support
quantity_support()
from marxs.reduction import osip, ogip
from marxs.missions.arcus import arfrmf
from marxs.analysis.plots import OrderColor
from marxs.missions.arcus.ccd import CCDRedist, ccdfwhm
ccdredist = CCDRedist()
%matplotlib inline
```

In [16]:

```
oc = OrderColor(max_order=12)
```

The grating equation is $$m\lambda = d \sin\theta$$ where $m$ is the diffraction order, $\lambda$ is the wavelength, $d$ is the grating period, and $\theta$ is the diffraction angle. Since we know the x/y position on the detector for each detected photon and the geometry of the instrument, we can calculate the angle $\theta$ for each photon. Since we also know $d$, it is often convenient for order-sorting to describe the position of a photon as $m\lambda$. Photons of 60, 30, 20, and 15 Å, diffracted into order 1, 2, 3, and 4 respecively, fall on the same position on the detector, because they have the same value for $m\lambda$. Note that in Arcus the most common orders are labelled with negative numbers.

One way to look at the order-sorting is to display the $m\lambda$ coordinate along the x-axis and the CCD energy along the y-axis. Because of their shape, these plots are colloquially known as "banana plots".

It is easy to analytically calculate the difference in CCD energy between the different orders for any given value of $m\lambda$ (which corresponds to a given location on the CCD): With $m \lambda = d \sin\theta$ and the photon energy $E = h c / \lambda$, we can compare the energies of two photons at the same $d \sin\theta$, but with different orders $m$ and $n$:

$$ d \sin\theta = \frac{m h c}{E_m} = \frac{n h c}{E_n} $$We can now calculate the difference:

$$\Delta E = E_m - E_n = E_m - E_m \frac{n}{m} = \frac{E_m}{m} (m - n)$$For a given detector position, the energy difference between two consecutive orders is constanst. Re-writing the energy as $m\lambda$ again, we can see that $\Delta E$ is largest for small values of $m\lambda$, closer to the zeroth-order:

$$\Delta E = \frac{hc (m-n)}{m\lambda_m}$$The fact that the orders are further apart on one side of the banana plot than on the other one means that order sorting will work better on one side of the detector, while the orders overlap more on the other side.

In [17]:

```
grid_neg_out = np.arange(80, 160) * u.Angstrom
```

In [18]:

```
fig, ax = plt.subplots()
for o in np.arange(-1, -12, -1):
en = (grid_neg_out / np.abs(o)).to(u.keV, equivalencies=u.spectral())
ax.plot(grid_neg_out, en, label=o, **oc(o))
ax.set_xlabel('$m\lambda$ [{}]'.format(ax.get_xlabel()))
ax.set_ylabel('CCD energy [{}]'.format(ax.get_ylabel()))
out = ax.legend()
```

Lines in a "banana plot", showing the relation between $m\lambda$ and photon energy. Every value of $m\lambda$ corresponds to a specific position on the detector.

In a real CCD the energy of each photon is imperfectly measured. For the Arcus CCDs, we predict a FWHM around 70 eV for the soft X-ray range. That means that not all photons that belong to a specific order land exactly on the line in the order-sorting plot. Instead, they are distributed in energy.

In [19]:

```
from sherpa.models import NormGauss1D
```

In [20]:

```
fig, ax = plt.subplots()
funcs = []
en_grid = np.arange(0., 1.5, 0.01)
for o in np.arange(-1, -12, -1):
en = (120 * u.Angstrom / np.abs(o)).to(u.keV, equivalencies=u.spectral())
func = NormGauss1D(str(o))
func.pos = en
func.FWHM = np.interp(en, ccdfwhm['energy'],
ccdfwhm['FWHM'] / 1000) # eV to keV
funcs.append(func)
ax.fill_between(en_grid, func(en_grid), label=o, alpha=.5, **oc(o))
ax.axvline(en.value, **oc(o))
ax.set_ylabel('Relative contribution')
ax.set_xlabel(f'CCD energy [{en.unit.to_string(format="latex_inline")}]')
funcs = sum(funcs[1:], funcs[0])
ax.plot(en_grid, funcs(en_grid), 'k', lw=4, label='sum')
ax.set_ylim(0, None)
out = ax.legend()
```

This plot is a for a specific value of $m\lambda=120$ Å. The vertial lines mark the exact energy that a photon of that order should have at $m\lambda=120$ Å. This is the value that can be read off the colored curved in the figure just above for $m\lambda=120$ Å. The colored distributions show the probability density of detecting a photon of a specific order at a specific energy, or in other words, they show the energy distribution that we will observe form a large number of photons in an order. Because these distributions are wider than the distance in energy between two consequtive orders, they overlap. Thus, a photon that arrives at this detector position and has a detected energy of 0.45 keV could be either an order -4 photon or an order -5 photon. So, when we extract the order -5, we need to make a choice: Will this photon be extracted or not?

Note that in this plot, every order is shown on the y-axis at the same height to explain the concepts applied here. In practice, the height of each order depends both on the characteristics of Arcus, e.g. the quantum efficiency of the CCD changes with energy, and on the spectrum of the source, e.g. if the source does not emit any photons below 0.5 keV, then there will be no signal in order -1 to -5 and therefore also no contamination of the extracted order -6 by order -5 photons.

In [21]:

```
fig, ax = plt.subplots()
funcs = []
en_grid = np.arange(0.4, 7, 0.001)
ocn = OrderColor(max_order=6)
for o in np.arange(-4, -7, -1):
en = (120 * u.Angstrom / np.abs(o)).to(u.keV, equivalencies=u.spectral())
func = NormGauss1D(str(o))
func.pos = en
func.FWHM = np.interp(en, ccdfwhm['energy'],
ccdfwhm['FWHM'] / 1000) # eV to keV
funcs.append(func)
ax.fill_between(en_grid, func(en_grid), alpha=.5, **ocn(o))
ax.axvline(en.value, **ocn(o), label=o)
ax.set_ylabel('Relative contribution')
ax.set_xlabel(f'CCD energy [{en.unit.to_string(format="latex_inline")}]')
funcs = sum(funcs[1:], funcs[0])
ax.plot(en_grid, funcs(en_grid), 'k', lw=4, label='sum')
ax.set_ylim(0, 17)
for i, (myosip, label) in enumerate([(osip.FixedWidthOSIP(30 * u.eV, ccd_redist=ccdredist), 'osip60'),
(osip.FixedWidthOSIP(40 * u.eV, ccd_redist=ccdredist), 'osip80'),
(osip.FractionalDistanceOSIP(fraction=1, ccd_redist=ccdredist), 'osiptouch'),]):
for o in [4,5,6]:
ax.plot(myosip.osip_range([120] * u.Angstrom / o, o).to(u.keV), np.array([14, 14]) + i,
lw=5, c=ocn(o)['color'])
ax.legend()
out = ax.set_xlim(0.4, 0.65)
```

This plot is similar to the above, but zooms into a narrower energy region. The colored bars on the top mark the energy regions that will be extracted for each order. The three sets of bars at $y=16$, $y=15$, and $y=14$ show different possible ways to define the order sorting regions. The top bars at $y=16$ are for a scenario where the extraction regions for the orders touch. This maximizes the number of photons extracted because each photon will be assigend to exactly one extracted spectrum. The cost for this is that an order -5 spectrum will have significant contributions from order -4 and -6, which can complicate spectral fitting. On the other hand, a narrower region like the bottom of the three bars (at y=14) is for a scenario where only photons within 30 eV of the expected energy are extracted. One can see that this leads to a considerable loss of photons, because the integrated probability for a photon to land in the extracted regions is less than 60%; in other words, this choice of extraction region reduces the effective area of Arcus by more then 40%. However, it also reduces contributions from order -4 and -6.

In [22]:

```
orders = np.arange(-11, 0)
grid = np.arange(80, 160) * u.Angstrom
for thisosip, outd in [(osip.FixedWidthOSIP(30 * u.eV, ccd_redist=ccdredist), 'Extract $\pm$ 30 eV'),
(osip.FixedWidthOSIP(40 * u.eV, ccd_redist=ccdredist), 'Extract $\pm$ 40 eV'),
(osip.FractionalDistanceOSIP(fraction=1, ccd_redist=ccdredist), 'Extract every photon once'),
(osip.FractionalDistanceOSIP(fraction=.7, ccd_redist=ccdredist), 'Almost touching extraction regions')
]:
fig, axes = plt.subplots(ncols=2, figsize=(8, 4))
for order in orders:
thisosip.plot_osip(axes[0], grid, order,
**oc(order))
# pick the middle order for plotting purposes
o_mid = orders[len(orders) // 2]
thisosip.plot_mixture(axes[1], grid, o_mid)
fig.subplots_adjust(wspace=.3)
fig.suptitle(outd, size=15)
```

*left:* Banana plots for four different definitions of the order-sorting boundaries. For each scenario, the shaded color marks the region in energy space where photons will be assigned to a particular order. If a photon is detected in a white region, it will not be assigned to an order.

*right*: For one examplary order, the right plots show the fraction of photons in the main order that will be extracted (blue) and the relative importance of the contaminating orders. When the sum of all extracted photons (black) reaches into the red shaded zone, some photons are extracted more than once, which breaks the assumption that all data bins are independent and Poisson distributed.

The four scenarios shown here are all chosen to extract the bulk of the photons, while keeping the contamination managable, but they differ in detail. The top two have extraction regions with a fixed width. Since the energy resolution of the CCD is essentially constant over the energy range shown here, that means that the fraction of photons in the main order is also constant. However, since the orders are closer together towards larger values of $m\lambda$, the contamination increases. For the $\pm 40$ eV case, the extraction regions actually overlap at the very right end of the plot.

In the third scenario, the extraction regions touch each other. That way no photon is lost and we maintain the full effective area of Arcus. The price for that is that at high values of $m\lambda$ up to one fifth of the photons extracted in each order are actually coming from a contaminating order.

The last scenario imposes some space between the orders for all values of $m\lambda$. This leads to a significant loss of effective area of up to 30% for high $m\lambda$, but also reduces the contamination by a factor of two compaed to the "extraction regions touch" scenario.

Other prescriptions for the order-sorting bounadaries are possible, but the four scenarios described here are studied in more detail. Note that photons are not evenly distributed over the orders. Because of the blaze peak, more photons are detected in the center of an order than at the left or right end.

So far, the discussion above concentrated on the relative importance for each order. In practice, more factors contribute to the effective area. Here, I will show the actual effective area calculated for Arcus for different order-sorting scenarios.

In [23]:

```
root = 'far_'
```

In [24]:

```
fig, axes = plt.subplots(ncols=3, figsize=(15, 4))
for s, c in zip(['osip60', 'osip80', 'osiptouch'], 'ryb'):
for o in [-6, -5, -7]:
arf = ogip.ARF.read(pjoin(get_path('arfrmf'), s,
root + arfrmf.filename_from_meta('arf', ARCCHAN='all', TRUEORD=o,
CCDORDER=-6, ORDER=-6)))
if o == -6:
basearf = arf
axes[0].plot(arf['BIN_LO'], arf['SPECRESP'], c=c, label=s if o==-5 else '__no_legend__')
axes[1].plot(basearf['BIN_LO'], arf['SPECRESP'], c=c, label=s if o==-5 else '__no_legend__')
if o != -6:
with np.errstate(invalid='ignore'):
axes[2].plot(basearf['BIN_LO'], arf['SPECRESP'] / basearf['SPECRESP'],
c=c, label=s if o==-5 else '__no_legend__')
axes[0].legend()
for ax in axes[:2]:
ax.set_yscale('log')
ax.set_ylim(1, None)
ax.set_ylabel('Effective area [{}]'.format(ax.get_ylabel()))
axes[0].set_xlabel('True wavelength [{}]'.format(axes[0].get_xlabel()))
for ax in axes[1:]:
ax.set_xlabel('Apparent wavelength [{}]'.format(ax.get_xlabel()))
out = axes[2].set_ylabel('strength of contaminating order')
```

*left:* Effective area for order -6 (the high, central curve) and the two contaminating orders -5 and -7 (lower curves to the left and right). All three contribute to the total effective area in the extracted spectrum, but their contributions are shown here separately. The x-axis shows the true wavelength of the photons seen in the extracted order -6. Even photons $>30$ Å contribute to the extracted -6 order spectrum.

*middle:* The same plot as on the left, but the x-axis shows the wavelength that will be assigned to the detected photons, not their true wavelength. The central curve is the same, but photons of order -5 and -7 will be assigned the wrong wavelength, since they are errornously identified as order -6 photons. Those $30$ Å photons discussed as example, will be detected around 25 Å, making the identification of any feature more difficult.

*right:* Ratio of the contaminating orders to the main order.

In contrast to the above, the curves shown here now take all contributions to the ARF into account, e.g. the chip gaps (which look like small rectangles "hanging down" from the curve), the CCD QE, and the filter transmission. Thus, the contribution of order -5 and -7 are no longer the same, unlike in the banana plots above.

All figures here show ARFs for different order-sorting scenarios. The scenario with touching extraction regions (blue) delivers the highest total effective area, but has also the highest contamination.

In [25]:

```
wavegrid = np.arange(0, 60, .1) * u.Angstrom
specresp = np.zeros(len(wavegrid)) * u.cm**2
```

In [26]:

```
from glob import glob
arflist = glob(pjoin(get_path('arfrmf'), 'osiptouch', '*.arf'))
for f in arflist:
arf = ogip.ARF.read(f)
# Interpolate on common wavelength grid. Good enough for plotting.
# Interp requires sorted input and ARFs are reverse-sorted in wavelength
specresp += np.interp(wavegrid, arf['BIN_LO'][::-1], arf['SPECRESP'][::-1])
fig, ax = plt.subplots()
ax.plot(wavegrid, specresp)
ax.set_xlabel(f'$\\lambda$ [{wavegrid.unit.to_string("latex_inline")}]')
ax.set_ylabel(f'effective area [{specresp.unit.to_string("latex_inline")}]')
ax.set_title('Combined effective area of all dispersed orders')
```

Out[26]:

Text(0.5, 1.0, 'Combined effective area of all dispersed orders')

So far, I've described order-sorting in the traditional paradigm used for e.g. Chandra/HETGS or XMM/RGS where one spectrum is extracted for each order and each spectrum has one or more associated ARFs and RMFs. Data like that can be fitted with the common tools in X-ray astronomy, like XSPEC, Sherpa, ISIS, etc., which are well tested, accepted in the field, and implement proper Poisson statstics. However, to a certain degree that can feel like putting a square peg into a round hole.

Arcus data may profit from new, advanced fitting strategies. Several approaches have been discussed in the simulations group for example:

- fractional counts, where a fraction of each count is assigned to an order (but destroys Poisson likelyhood)
- assgin each photon to one order, but choose that order randomly based on the relative likelyhood for each order at that position and CCD energy (hard to make ARF and RMF)
- fit in PHA space, i.e. instead of a number of 1D spectra, fit essentially the 2D bananaplot image.

However, each of these approaches has potential downsides and requires more development.