|
|
|
Bayesian Blocks Functions
Based upon the algorithms and MATLAB routines of Jeffrey Scargle, these
routines take an event list (i.e., an ordered list of the times of
detected events), and create a piecewise-constant rate representation
of the lightcurve. That is, the lightcurve is represented as a series
of contiguous "blocks" within which the event rate is modeled as a
constant. The determination of the location of the "change points"
(i.e., the boundaries between the blocks) and the rates within the
blocks is made via a Bayesian analysis (assuming Poisson
statistics). A user-selectable parameter assigns the prior probability
for the number of blocks, which is roughly equivalent to assigning a
significance level threshhold for each detected block. (The
simple `frequentist' interpretation of this parameter is best used for
guiding, for example, Monte Carlo simulations to gauge the true
significance level.)
A description of the algorithm can be found in:
"Studies in Astronomical Time Series Analysis. VI.
Optimum Partition of the Interval: Bayesian Blocks,
Histograms, and Triggers", by J.D. Scargle et al., 2003 (in preparation)
Two routines must be called by the user. The event times are read by
sitar_make_data_cells, which then creates the input (either as
a series of event "cells", or bins for a binned-mode) for
sitar_global_optimum. The latter outputs the locations of the
statistically significant "change points" as well as the rates,
counts, and associated errors within each block. These routines are described below.
- sitar_make_data_cells ( tt, type, max_delt, frame, tstart, tstop )
Assign events to "cells" described by the cell population and size.
Run as: isis> cell = sitar_make_data_cells(tt,type,max_delt,frame,tstart,tstop);
Inputs:
-
tt : The times of the detected events
-
type : Describe cells as "events" (type=1
or 2), with the size (i.e., normalized duration) being the
distance from halfway from the previous event to halfway to the next
event (type=1), or from the current event to right before the
subsequent event (type=2). (Note: all cell populations are
greater than or equal to one, and the total number of cells is less
than or equal to the total number of events.) Alternatively, events
can be assigned to bins of uniform size (type=3). (For this
case, cells can have zero population, and the total number of cells
can greatly exceed the total number of events.)
-
max_delt : Events closer together than
max_delt are grouped together in a single cell. (Note: this
is for grouping of consecutive events, such that the resulting
cell duration can be greater than max_delt.)
-
frame : The output cell sizes are in units of
frame, or this is the bin size for type=3.
-
tstart, tstop : The start and stop
times for creating the output cells.
Outputs:
-
cell : A structure with fields:
cell.pops - an array of the number of events per cell;
cell.size - the size (i.e. duration) of a cell in units of
frame or the mean counts per bin (derived from
the total lightcurve) in binned mode (type=3);
cell.lo_t,.hi_t - the start and stop times of the cell;
cell.dtcor - an array, currently set to unity, for
storing "dead time corrections" (or "efficiency" corrections) of
the cells. Must be changed externally to this subroutine, and
currently only used in sitar_global_optimum to deadtime
correct the output rates of the resulting blocks.
- sitar_global_optimum( cell, ncp_prior, ev_type [, first, alpha, beta] )
Find the maximum likelihood location of the change points.
Run as: isis> results = sitar_global_optimum(cell,ncp_prior,type);
Inputs:
-
cell : The output structure from
sitar_make_data_cells. Important note: If creating
`cell' by hand for `binned mode', cell.size should have the
expected counts per bin presuming a uniform rate. This is
essentially the identical procedure one takes when using `Bayesian
Blocks' for searching for emission/absorption lines in gratings
data, as described here.
-
ncp_prior : The prior probability for the number
of "blocks" is taken to be: gamma^(# of blocks). Here,
ncp_prior = -log( gamma ), and is very, very roughly
interpretable as a significance level for each detected block.
I.e., a rough proxy for significance level is 1 -
exp(-ncp_prior). (Bayesians never talk about significances this
way, but people who use codes do, so I bow to the pressure. A
better method would be to run Monte Carlo simulations to gauge the
significance, and use the above 'interpretation' to judge how many
Monte Carlo trials you might need to run. Research is currently
underway to help clarify and quantify 'significance levels' of the
algorithm.)
-
type : Identical to type from
sitar_make_data_cells.
-
first : (Optional) If first > 0, then the
code is run in "trigger" mode, and returns upon the first sign of a
change, so long as it is greater than 'first' cells from the
beginning of the lightcurve.
-
alpha, beta : (Optional) For binned mode
(type=3), the prior on the counts per bin of unit size is
presumed to be proportional to (Lambda)^(1-alpha) exp(-beta Lambda),
where Lambda is the count rate per unit sized bin. The default
values are alpha=beta=1, but these can be modified.
For event mode (type=1,2), different choice of alpha and beta
lead to different priors when caculating the likelihood function for
a block.
- alpha=0, beta=0: The default, with the likelihood
being marginalized over a uniform distribution (0->1) of p_1, the
probability that a frame has one or more events in it.
- alpha>0, beta=0: The prior on p_1 is taken as (alpha+1) *
(1-p_1)^alpha.
- alpha=0, beta>0: The prior is now based upon *rate*, and
is given by exp(-Lambda/Lambda_0), where Lambda_0 is the observed
mean rate per frame, calculated over the whole lightcurve. Note:
this is equivalent to the above, with (alpha+1) == Lambda_0^{-1}.
- alpha>0, beta>0: The prior is now uniform over rate,
between bounds given by alpha, beta. NOTE: Requires use of the GSL
module from the S-Lang Modules Packages.
- alpha=beta >0: Uniform rate prior, with the bounds
automatically chosen to be 1/3 and 3 times the mean rate
detected in an individual cell of the lightcurve. NOTE: Requires use of the GSL
module from the S-Lang Modules Packages.
Outputs:
-
results : A structure with fields
results.cpt - An array of change point locations for the
maximum likelihood solution (indices are specific to the cell
input); results.last, .best - (Diagnostic purposes only)
Arrays of the location of the last change point and the associated
maximum log probability for the truncated lightcurve with its length
equal to the array index; results.cts - Counts in each block;
results.rate - Rate in each block; results.err -
(Gherels) poisson error for the rate in each block;
results.lo_t, .hi_t - Times of lower (>=) and upper (<)
block boundaries.
|