Version 1.0.0  

Bayesian Blocks

 

Bayesian Blocks Functions

Based upon the algorithms and MATLAB routines of Jeffrey Scargle, these routines take an event list (i.e., an ordered list of the times of detected events), and create a piecewise-constant rate representation of the lightcurve. That is, the lightcurve is represented as a series of contiguous "blocks" within which the event rate is modeled as a constant. The determination of the location of the "change points" (i.e., the boundaries between the blocks) and the rates within the blocks is made via a Bayesian analysis (assuming Poisson statistics). A user-selectable parameter assigns the prior probability for the number of blocks, which is roughly equivalent to assigning a significance level threshhold for each detected block. (The simple `frequentist' interpretation of this parameter is best used for guiding, for example, Monte Carlo simulations to gauge the true significance level.)

A description of the algorithm can be found in:

     "Studies in Astronomical Time Series Analysis. VI.
      Optimum Partition of the Interval: Bayesian Blocks,
      Histograms, and Triggers",  by J.D. Scargle et al., 2003 (in preparation)

Two routines must be called by the user. The event times are read by sitar_make_data_cells, which then creates the input (either as a series of event "cells", or bins for a binned-mode) for sitar_global_optimum. The latter outputs the locations of the statistically significant "change points" as well as the rates, counts, and associated errors within each block. These routines are described below.

  1. sitar_make_data_cells ( tt, type, max_delt, frame, tstart, tstop )
    Assign events to "cells" described by the cell population and size.
    Run as: isis> cell = sitar_make_data_cells(tt,type,max_delt,frame,tstart,tstop);

    Inputs:
    • tt : The times of the detected events
    • type : Describe cells as "events" (type=1 or 2), with the size (i.e., normalized duration) being the distance from halfway from the previous event to halfway to the next event (type=1), or from the current event to right before the subsequent event (type=2). (Note: all cell populations are greater than or equal to one, and the total number of cells is less than or equal to the total number of events.) Alternatively, events can be assigned to bins of uniform size (type=3). (For this case, cells can have zero population, and the total number of cells can greatly exceed the total number of events.)
    • max_delt : Events closer together than max_delt are grouped together in a single cell. (Note: this is for grouping of consecutive events, such that the resulting cell duration can be greater than max_delt.)
    • frame : The output cell sizes are in units of frame, or this is the bin size for type=3.
    • tstart, tstop : The start and stop times for creating the output cells.
    Outputs:
    • cell : A structure with fields:
      cell.pops - an array of the number of events per cell;
      cell.size - the size (i.e. duration) of a cell in units of frame or the mean counts per bin (derived from the total lightcurve) in binned mode (type=3);
      cell.lo_t,.hi_t - the start and stop times of the cell;
      cell.dtcor - an array, currently set to unity, for storing "dead time corrections" (or "efficiency" corrections) of the cells. Must be changed externally to this subroutine, and currently only used in sitar_global_optimum to deadtime correct the output rates of the resulting blocks.

  2. sitar_global_optimum( cell, ncp_prior, ev_type [, first, alpha, beta] )
    Find the maximum likelihood location of the change points.
    Run as: isis> results = sitar_global_optimum(cell,ncp_prior,type);

    Inputs:
    • cell : The output structure from sitar_make_data_cells. Important note: If creating `cell' by hand for `binned mode', cell.size should have the expected counts per bin presuming a uniform rate. This is essentially the identical procedure one takes when using `Bayesian Blocks' for searching for emission/absorption lines in gratings data, as described here.
    • ncp_prior : The prior probability for the number of "blocks" is taken to be: gamma^(# of blocks). Here, ncp_prior = -log( gamma ), and is very, very roughly interpretable as a significance level for each detected block. I.e., a rough proxy for significance level is 1 - exp(-ncp_prior). (Bayesians never talk about significances this way, but people who use codes do, so I bow to the pressure. A better method would be to run Monte Carlo simulations to gauge the significance, and use the above 'interpretation' to judge how many Monte Carlo trials you might need to run. Research is currently underway to help clarify and quantify 'significance levels' of the algorithm.)
    • type : Identical to type from sitar_make_data_cells.
    • first : (Optional) If first > 0, then the code is run in "trigger" mode, and returns upon the first sign of a change, so long as it is greater than 'first' cells from the beginning of the lightcurve.
    • alpha, beta : (Optional) For binned mode (type=3), the prior on the counts per bin of unit size is presumed to be proportional to (Lambda)^(1-alpha) exp(-beta Lambda), where Lambda is the count rate per unit sized bin. The default values are alpha=beta=1, but these can be modified.
      For event mode (type=1,2), different choice of alpha and beta lead to different priors when caculating the likelihood function for a block.
      1. alpha=0, beta=0: The default, with the likelihood being marginalized over a uniform distribution (0->1) of p_1, the probability that a frame has one or more events in it.
      2. alpha>0, beta=0: The prior on p_1 is taken as (alpha+1) * (1-p_1)^alpha.
      3. alpha=0, beta>0: The prior is now based upon *rate*, and is given by exp(-Lambda/Lambda_0), where Lambda_0 is the observed mean rate per frame, calculated over the whole lightcurve. Note: this is equivalent to the above, with (alpha+1) == Lambda_0^{-1}.
      4. alpha>0, beta>0: The prior is now uniform over rate, between bounds given by alpha, beta. NOTE: Requires use of the GSL module from the S-Lang Modules Packages.
      5. alpha=beta >0: Uniform rate prior, with the bounds automatically chosen to be 1/3 and 3 times the mean rate detected in an individual cell of the lightcurve. NOTE: Requires use of the GSL module from the S-Lang Modules Packages.
    Outputs:
    • results : A structure with fields results.cpt - An array of change point locations for the maximum likelihood solution (indices are specific to the cell input); results.last, .best - (Diagnostic purposes only) Arrays of the location of the last change point and the associated maximum log probability for the truncated lightcurve with its length equal to the array index; results.cts - Counts in each block; results.rate - Rate in each block; results.err - (Gherels) poisson error for the rate in each block; results.lo_t, .hi_t - Times of lower (>=) and upper (<) block boundaries.

This page was last updated Aug 22, 2011 by Michael Nowak. To comment on it or the material presented here, send email to mnowak@space.mit.edu.
Valid HTML 4.01! Made with JED. Viewable With Any Browser.