Welcome to Max Tegmark's CMB data analysis center

This site contains links to cosmic microwave background resources across the globe, focusing on data analysis, and is up-to-date through about year 1999. Click on any part of the data analysis pipeline below for details. For theoretical aspects, see the twin site, Wayne Hu's theory center. See also the experiment overview page.


The figure above schematically illustrates the many steps involved in producing future high-precision constraints on cosmological models. If non-Gaussian CMB fluctuations are present (as in, say, a topological defect model), then additional processing of the CMB map will be desirable, since the phases will contain information in addition to that in the power spectrum. The various steps are discussed in more detail below.

The microwave sky

At microwave frequencies, the sky contains a gold mine of information about the early Universe, since the cosmic microwave background (CMB) fluctuations depend sensitively on most cosmological parameters - click here for an online review. Unfortunately, this cosmic background signal is contaminated by various types of foreground emission - a recent review is given in Section 4 of Tegmark & Efstathiou (1996).

Measurement (experimental design issues)

Click here for a summary of all CMB experiments today, with links. When planning and designing experiments with a finite budget, one obviously wants to

The raw data

Before the raw measured data set can be used to do science, it must of course be carefully calibrated, cleaned and checked for systematic problems, as described below.

Calibration and cleaning

Although perhaps the most unglamorous step in the pipeline, this is often the most time-consuming step as well. For a rather lengthy list of potential problems to worry about, see Tom Herbig's Little Shop of Horrors. An instructive case study (for COBE) is Kogut et al (1996), and ways of facilitating systematic error detection and removal by means of a good pointing strategy are discussed by Wright (1996).

Time-ordered data

The result of this gruesome process is the time-ordered data set (TOD). For future satellite missions, it might contain as many as 10,000,000,000 numbers. For a total-power measurement, these numbers are simply the positions and temperatures of all the pixels observed, in chronological order, in each channel. For single-difference experiments (like COBE and MAP), the TOD consists of pairs of pixel positions and the temperature difference between them. For more general chopping schemes, each temperature in the TOD is some linear combination of the temperatures across the sky. In principle, the cosmological parameters can be measured with the smallest error bars possible by performing a brute-force likelihood analysis on the TOD. In practice, this is numerically infeasible for large data sets. Analysis of large future data sets will therefore necessarily involve the intermediate step of reducing the TOD to maps.
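
To make the "linear combination" statement concrete, here is a minimal sketch of how a TOD could be generated from a sky map for a total-power and a single-difference scan. The tiny 6-pixel sky, the random scan pattern and all variable names are hypothetical and chosen only for illustration; they do not describe any real experiment.

```python
import numpy as np

rng = np.random.default_rng(0)

n_pix = 6    # hypothetical tiny sky
n_obs = 12   # number of time-ordered samples
sky = rng.normal(size=n_pix)  # "true" pixel temperatures, arbitrary units

# Total-power scan: each sample is the temperature of one pixel,
# so each row of the matrix has a single 1.
pointing = rng.integers(0, n_pix, size=n_obs)
A_total = np.zeros((n_obs, n_pix))
A_total[np.arange(n_obs), pointing] = 1.0

# Single-difference scan: each sample is T(pixel a) - T(pixel b),
# so each row has one +1 and one -1.
pix_a = rng.integers(0, n_pix, size=n_obs)
pix_b = (pix_a + rng.integers(1, n_pix, size=n_obs)) % n_pix  # always != pix_a
A_diff = np.zeros((n_obs, n_pix))
A_diff[np.arange(n_obs), pix_a] = 1.0
A_diff[np.arange(n_obs), pix_b] = -1.0

# Noise-free TODs: in both cases a matrix acting on the sky vector.
tod_total = A_total @ sky
tod_diff = A_diff @ sky
```

A more general chopping scheme would simply put other coefficients in each row of the matrix.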


Map-making offers a convenient way to distill the cosmological information from the TOD into a much smaller data set. In addition, maps are of course useful for comparing different experiments with one another, for subtracting foregrounds, and for looking for spatial features in the data (e.g., systematic problems and non-Gaussian signals). By linearity, the TOD data vector y can be written y = Ax + n, where the vector x contains the temperatures in each of the pixels in the map, the vector n denotes the noise and A is some known matrix determined by the pointing strategy. In the past, at least four different map-making methods have been employed for estimating the map x from the TOD y. Ten map-making methods are compared by Tegmark (1997), and it is found that several of them (the COBE method and various variants of Wiener filtering) have the nice property that they retain all the cosmological information from the TOD. This means that the parameters can be measured just as accurately from the map(s) as from the full TOD. The conclusion of this paper is that the TOD should be reduced to maps using the COBE method, since
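
As an illustration of the model y = Ax + n, the sketch below solves for the map with a generalized least-squares (minimum-variance) estimator, x_hat = (A^T N^-1 A)^-1 A^T N^-1 y, assuming uncorrelated noise with known variance. To my understanding this is the form usually associated with the COBE method, but treat that identification, the function name make_map and the toy scan as assumptions rather than as a description of the pipeline in Tegmark (1997).

```python
import numpy as np

def make_map(A, y, noise_var):
    """Least-squares map from a TOD, assuming diagonal noise covariance N.

    Returns the estimated map and its noise covariance (A^T N^-1 A)^-1.
    """
    inv_var = 1.0 / noise_var              # N^-1 for uncorrelated noise
    lhs = A.T @ (inv_var[:, None] * A)     # A^T N^-1 A
    rhs = A.T @ (inv_var * y)              # A^T N^-1 y
    return np.linalg.solve(lhs, rhs), np.linalg.inv(lhs)

rng = np.random.default_rng(1)
n_pix, n_obs = 5, 2000
x_true = rng.normal(size=n_pix)

# Hypothetical total-power scan that visits every pixel equally often,
# so that A^T N^-1 A is invertible.
pointing = np.arange(n_obs) % n_pix
A = np.zeros((n_obs, n_pix))
A[np.arange(n_obs), pointing] = 1.0

noise_var = np.full(n_obs, 0.25)
y = A @ x_true + rng.normal(scale=np.sqrt(noise_var))

x_hat, map_cov = make_map(A, y, noise_var)
```

For a purely differencing experiment the mean of the map is unconstrained, so A^T N^-1 A is singular and the solve step would need a pseudo-inverse or an extra constraint. For realistic map sizes one would also avoid forming dense matrices.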

Multi-frequency maps

The result of the map-making step is a number of sky maps at different frequencies (those from different channels at the same frequency are of course combined after appropriate systematics checks of the difference maps). For the subsequent analysis, it is often desirable to include maps at additional frequencies from other experiments as well - both from other CMB experiments and from "foreground experiments" such as DIRBE and IRAS. Maps corresponding to the "electric" and "magnetic" parts of the polarization field (Kamionkowski et al 1996, Zaldarriaga et al 1996) are likely to be useful as well.

Foreground removal

To remove contaminating foreground signals, we can take advantage of all ways in which they differ from the CMB signal:

A detailed discussion of all these issues is given in Tegmark & Efstathiou (1996), and detailed simulations can be found in Brandt et al (1994). The bottom line is that for future multichannel experiments like MAP and PLANCK (née COBRAS/SAMBA), an easy-to-implement subtraction scheme can eliminate all foregrounds to an accuracy of much better than a percent if current foreground estimates are OK. Unfortunately, this is still a big if, as we lack e.g. accurate point source counts between 20 and 200 GHz.
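
To give a feel for how multichannel data help, here is a toy sketch that uses only one difference between foregrounds and the CMB, namely frequency dependence. It forms the minimum-variance linear combination of channel maps that has unit response to the CMB. This is a generic textbook-style combination, not the scheme of Tegmark & Efstathiou (1996), and the channel frequencies, the power-law foreground scaling and the noise levels are all made-up numbers.

```python
import numpy as np

rng = np.random.default_rng(2)
n_pix = 20000
freqs = np.array([30.0, 90.0, 217.0])           # GHz, hypothetical channels

cmb = rng.normal(scale=30.0, size=n_pix)        # identical in every channel
fg_template = rng.normal(scale=50.0, size=n_pix)
fg_scaling = (freqs / 30.0) ** -3.0             # hypothetical power-law foreground
noise = rng.normal(scale=1.0, size=(freqs.size, n_pix))

# Each channel map = CMB + frequency-scaled foreground + noise.
maps = cmb + fg_scaling[:, None] * fg_template + noise

# Weights w minimize the variance of w @ maps subject to sum(w) = 1,
# i.e. w = C^-1 e / (e^T C^-1 e) with e the CMB response vector.
C = np.cov(maps)                 # empirical channel-channel covariance
e = np.ones(freqs.size)
w = np.linalg.solve(C, e)
w /= e @ w

cleaned = w @ maps
```

Because the weights sum to one, the CMB passes through unchanged, while the combination suppresses whatever varies from channel to channel. With several foreground components whose spectra are poorly known, the real problem is of course much harder than this toy.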

CMB sky map

The result of the foreground removal is the merging of all channels into a single map. As of January 20, 1997, two-dimensional CMB maps have been published for these experiments:

Power spectrum estimation

Why do it?

If the statistical properties of the CMB fluctuations are isotropic and Gaussian (which they are in the standard inflationary models), then all the cosmological information in a sky map is contained in its power spectrum C_l (the variance of its spherical harmonic coefficients, corrected for beam smearing). This means that all the information from even a giant data set (say a map with n=10^7 pixels) can be reduced to just a couple of thousand numbers, greatly facilitating parameter estimation (the next step in the pipeline). Indeed, for future experiments, it has been argued that this data compression step will be necessary to make parameter estimation numerically feasible, just as the TOD must be compressed (via map-making) to make power spectrum estimation numerically feasible.
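
The definition of C_l as the variance of the spherical harmonic coefficients can be sketched as follows for an idealized full-sky, noise-free map. The sketch assumes a real-valued harmonic basis (so that each multipole l has 2l+1 independent real Gaussian coefficients), ignores the beam correction, and uses a made-up input spectrum. The function name estimate_cl is hypothetical.

```python
import numpy as np

def estimate_cl(alm_by_l):
    """Estimate C_l as the mean square of the 2l+1 coefficients at each l."""
    return np.array([np.mean(a ** 2) for a in alm_by_l])

rng = np.random.default_rng(3)
l_max = 200
ells = np.arange(2, l_max + 1)
cl_true = 1.0 / (ells * (ells + 1.0))   # hypothetical input spectrum

# Draw 2l+1 Gaussian coefficients with variance C_l for every multipole.
alm_by_l = [rng.normal(scale=np.sqrt(c), size=2 * l + 1)
            for l, c in zip(ells, cl_true)]

cl_hat = estimate_cl(alm_by_l)

# Even a perfect experiment sees only 2l+1 coefficients per multipole,
# which gives a fractional scatter of sqrt(2 / (2l+1)) on each C_l.
fractional_scatter = np.sqrt(2.0 / (2 * ells + 1))
```

Here 40,397 coefficients are compressed into 199 numbers; the same logic is what reduces a map with millions of pixels to a few thousand C_l values.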

Why not compute some other statistic instead?

The power spectrum has emerged as the standard way to present experimental results in the literature since it has several advantages over, say, the correlation function:

How to do it?

There are two reasons why a straightforward expansion in spherical harmonics is not the best way to measure the power spectrum:

  1. One always has incomplete sky coverage (even with satellites, the Galactic plane must be discarded).
  2. One wishes to give less weight to noisier pixels (that have been observed fewer times) in order not to destroy information.

Both of these facts spoil the orthogonality of the spherical harmonics. Any quadratic combination of pixels will, appropriately normalized, measure some weighted average of the power spectrum - the weights are known as the window function. The non-orthogonality simply means that it is impossible to obtain an ideal (Kronecker delta) window function. Instead, the best you can do (Tegmark 1995) is to get a window function whose width is about the inverse of the smallest angular map dimension in radians, which is usually adequate for all practical purposes. There is a simple power spectrum estimation method (Tegmark 1996) that has the following nice properties:
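
The window function statement above can be made concrete with a small sketch. For an isotropic sky, the pixel covariance is <x_i x_j> = sum_l (2l+1)/(4 pi) C_l P_l(cos theta_ij), where P_l is a Legendre polynomial and theta_ij is the angle between pixels i and j (beam and noise are ignored here). A quadratic combination q = sum_ij E_ij x_i x_j therefore has expectation sum_l W_l C_l with W_l = (2l+1)/(4 pi) sum_ij E_ij P_l(cos theta_ij). The pixel layout, the weights and the function names below are arbitrary illustrative choices, and this is not the estimation method of Tegmark (1996).

```python
import numpy as np
from numpy.polynomial import legendre

def legendre_matrix(cos_sep, l):
    """P_l evaluated at the cosine of every pixel-pair separation."""
    coeffs = np.zeros(l + 1)
    coeffs[l] = 1.0                      # select the single polynomial P_l
    return legendre.legval(cos_sep, coeffs)

def window_function(E, cos_sep, l_max):
    """W_l such that <x^T E x> = sum_l W_l C_l for a noise-free isotropic sky."""
    return np.array([(2 * l + 1) / (4 * np.pi)
                     * np.sum(E * legendre_matrix(cos_sep, l))
                     for l in range(l_max + 1)])

# Hypothetical patch: 40 pixels along a 20-degree arc of the equator.
phi = np.linspace(0.0, np.radians(20.0), 40)
directions = np.stack([np.cos(phi), np.sin(phi), np.zeros_like(phi)], axis=1)
cos_sep = np.clip(directions @ directions.T, -1.0, 1.0)

# An oscillating, tapered weight pattern (arbitrary choice), squared.
weights = np.hanning(phi.size) * np.cos(30 * phi)
E = np.outer(weights, weights)

l_max = 80
W = window_function(E, cos_sep, l_max)
```

Plotting W against l for patches of different size is a quick way to see how limited sky coverage smears the response over a range of multipoles.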

Angular power spectrum

Once computed from the data, the power spectrum can be used to constrain cosmological models.

How to compute it theoretically?

Wayne Hu's online review gives an intuitive explanation of the physical origin of the features of the power spectrum. Another must-see is the set of Berkeley movies, showing how the power spectrum changes as you alter the parameters. For purely pragmatic purposes, the bottom line is revealed by the figure above:

How to compute it in practice?

By using "the fastest Boltzmann code in the west", written by Seljak & Zaldarriaga (1996). You can download it from the CMBFAST web site. It computes a typical spectrum in a minute or two - about 100 times faster than previous codes.

Model testing & parameter estimation

For future high-precision CMB experiments, parameter estimation with a simple chi-squared model fit to the observed power spectrum will give virtually the smallest error bars possible (section 5.4 in Tegmark 1996). For smaller data sets which produce weaker constraints, more accurate results can be obtained from a brute-force likelihood analysis of the sky map, as was done for COBE (Tegmark & Bunn 1995, Hinshaw et al 1996). The big caveat to all of this is that in a non-Gaussian model (such as cosmic strings or textures), the power spectrum is only part of the story, and additional information can be extracted from the phases of the map. The question of how to proceed in such a case is still wide open - for some early work on the subject, see Kogut et al (1996) and Ferreira & Magueijo (1996).
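
A simple chi-squared fit to a measured power spectrum can be sketched as below. The two-parameter power-law "model" is a hypothetical stand-in for real theoretical spectra (which would come from a Boltzmann code), the error bars include cosmic variance only, and the parameter names amplitude and tilt are invented for this example.

```python
import numpy as np

ells = np.arange(2, 501)

def model(amplitude, tilt):
    """Hypothetical toy spectrum, not the output of a Boltzmann code."""
    return amplitude * (ells / 100.0) ** tilt / (ells * (ells + 1.0))

def chi2(cl_obs, cl_model, sigma):
    """Sum over multipoles of the squared, error-normalized residuals."""
    return np.sum(((cl_obs - cl_model) / sigma) ** 2)

rng = np.random.default_rng(4)
truth = (1.0, -0.1)

# Cosmic-variance error bars, evaluated at the true model for simplicity.
sigma = np.sqrt(2.0 / (2 * ells + 1)) * model(*truth)
cl_obs = model(*truth) + rng.normal(scale=sigma)

# Brute-force grid search over the two parameters.
amps = np.linspace(0.8, 1.2, 81)
tilts = np.linspace(-0.3, 0.1, 81)
grid = np.array([[chi2(cl_obs, model(a, t), sigma) for t in tilts]
                 for a in amps])

i, j = np.unravel_index(np.argmin(grid), grid.shape)
best_amplitude, best_tilt = amps[i], tilts[j]
```

Contours of grid at fixed offsets above its minimum give approximate joint confidence regions for the two parameters. With many parameters, a grid search quickly becomes impractical and a proper minimizer is needed.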

Estimates of cosmological parameters

If a model similar to standard Cold Dark Matter (CDM) turns out to be correct, future CMB missions should be able to measure key cosmological parameters to an accuracy of a few percent or better (Jungman et al 1996; Bond, Efstathiou & Tegmark 1997; Zaldarriaga, Seljak & Spergel 1997). It is important to remember that although certain projects (such as trying to determine the shape of the inflaton potential by measuring the spectral index n, the tensor spectral index and the tensor-to-scalar ratio) require assumptions about untested high-energy physics, there are many other parameters that can be measured in a robust fashion by assuming little else than that we understand gravity and the behavior of hydrogen and photons at a few thousand degrees (Hu & White 1996). For instance, the spacing between the power spectrum peaks provides a fairly clean probe of the angle-distance relationship (which fixes a certain combination of Omega and Lambda). Whatever the true power spectrum turns out to look like, it is likely to help us clean up among the profusion of cosmological models that are currently on the market.


This page was last modified 1999.