NAME specdetect -- Detect features in a grating spectrum. USAGE specdetect [input_data_file] [num_cols_in_data] [flux_corrected] [output_peaks_file] [output_continuum_file] [data_minus_cont_file] [thresh_emission] [grating] [input_continuum_file] [thresh_absoption=] [continuum_method=] [median_n_bins=] [cont_offset=] [head_tail_cont_method] [head_value] [tail_value] [low_tmedian=] [high_tmedian=] [clobber=] [debug=] DESCRIPTION This program finds local maxima in one-dimensional grating data. A peak is defined as any bin where the counts are higher than at both neighboring bins. Absorption features and edges are not detected. If no continuum is supplied by the user, the program uses one of three methods to estimate the continuum level in the input spectrum. The program then searches for maxima which exceed the input noise threshold above the continuum. Detected maxima are subtracted from the spectrum, and the continuum calculation is repeated to refine the continuum estimate. Once the continuum level is determined, the scan for maxima is repeated to get the final list of detected features and approximate widths and amplitudes for use as initial guesses in subsequent line profile fitting. There are several output files from this tool; 1) root.peaks file contains the list of features found with the initial guesses for the fitting routines; 2) root_cont.dat file contains the calculated continuum for the entire data set; 3) root_nocont.dat file contains a copy of the data minus the continuum level. These output files may be used as input to line_measure_features. PARAMETERS input_data_file [filename] (root.dat) Input ascii data file containing 3 or 4 columns: wavelength, wavelength bin width (optional), counts or flux, and counts_error or flux_error. If the bin width column is not provided, the bin width is calculated using the wavelngth positions in the list, assuming no gaps in the data. The data columns are: col1=position, col2=optional bin width; if bin width present: col3=counts or flux, col4=counts_err or flux_err; if bin width not present: col2=counts or flux, col3=counts_err or flux_err. Sample file with 3 columns: lam counts counts_err 1.448900 1 1.0 1.460100 1 1.0 1.471200 2 1.41 1.482300 3 1.73 1.493400 3 1.73 1.504600 3 1.73 1.515700 2 1.41 1.526800 1 1.0 1.537900 0 0 1.549000 0 0 Sample file with 4 columns: lam lam_wid counts counts_err 1.448900 0.011200 1 1.0 1.460100 0.011100 1 1.0 1.471200 0.011100 2 1.41 1.482300 0.011100 3 1.73 1.493400 0.011200 3 1.73 1.504600 0.011100 3 1.73 1.515700 0.011100 2 1.41 1.526800 0.011100 1 1.0 1.537900 0.011100 0 0 1.549000 0.011200 0 0 The data file must end with ".dat" since ascfit reads only the format for now. num_cols_in_data = "3" [int: 3 or 4] The input data file contains either 3 or 4 columns of info. flux_corrected = no [boolean] Is the input file flux corrected? This parameter is currently not being used. output_peaks_file = "." [filename] (root.peaks) Output file for the detected peak information. "" or "stdout" or "none" displays output on the screen without saving it to a file. "." creates an output file name based on the input data file name, e.g., if data_file = "file.dat", then output_file = "file.peaks". output_continuum_file = "." [filename] (root_cont.dat) Output file for the continuum of the data set. This file contains four columns of information: wavelength, wavelength bin width, continuum counts, continuum counts error. The file is the same length as the input data file, with the same wavelength range and step sizes. The continuum counts are calculated via this tool, using the continuum detection methods listed below. The default counts error for the continuum are set to 0. "" or "stdout" or "none" displays output on the screen without saving it to a file. "." creates an output file name based on the input data file name, e.g., if data_file = "file.dat", then output_file = "file_cont.dat". data_minus_cont_file = "." [filename] (root_nocont.dat) Output file for the (data-continuum) information. This file contains four columns of information: wavelength, wavelength bin width, counts=(data-continumm) and counts_error= sqrt(data_error^2 + continuum_error^2). The file is the same length as the input data file, with the same wavelength range and step sizes. The continuum counts are calculated via this tool, using the continuum detection methods listed below. The (data - continuum) calculation is done once the continuum has been fit, using the original input data. "" or "stdout" or "none" displays output on the screen without saving it to a file. "." creates an output file name based on the input data file name, e.g., if data_file = "file.dat", then output_file = "file_cont.dat". thresh_emission = 4.0 [real] Minimum emission peak amplitude threshold. The emission line detect threshold = (thresh_emission * uncertainty). The threshold value is not from the minimum scale of 0 counts; the threshold is set from the continuum, as the minimum point. grating = HEG [HEG|heg|MEG|meg|LEG|leg] Grating associated with the input data. While the data is in ascii format, the tool needs to know the grating used to create the dataset. This parameter will go away with the use of FITS i/o. (input_continuum_file = none) [filename or none] (root_cont.dat) Optional input continuum file name. The file must have all 4 columns: position, col2=bin width, col3=counts or flux, col3=counts_err or flux_err. The length of the file must be the same as the input data file length. The output continuum file will not be calculated internally. All calculations using the continuum will use this file, if the file is specified. For a sample file, see the data_file parameter above (4 column data set). (thresh_absoption = 2.0) [real] Minimum absorption peak amplitude. The absorption line detect threshold = (thresh_absorption * uncertainty). This parameter is currently not being used by the tool, since there is no algorithm in place for the absorption line detection. (continuum_method = "median") [spline3|median|tmedian] Find the continuum using any of these methods: smoothing 'spline3', running 'median', truncated running 'tmedian'. The smoothing spline uses: ALGORITHM 642 COLLECTED ALGORITHMS FROM ACM. ALGORITHM APPEARED IN ACM-TRANS. MATH. SOFTWARE, VOL.12, NO. 2, JUN., 1986, P. 150. SUBROUTINE NAME - CUBGCV --------------------------------------------------------------- COMPUTER - VAX/DOUBLE AUTHOR - M.F.HUTCHINSON CSIRO DIVISION OF MATHEMATICS AND STATISTICS P.O. BOX 1965 CANBERRA, ACT 2601 AUSTRALIA The running median method involves calculating the median within a window of width "median_n_bins" (a parameter). The calculated median value is used as the continuum level at location + median_n_bins/2, and the window is shifted one bin. Once the running median is calculated, the head bins median_n_bins/2 and tail bins median_n_bins/2 of the continuum data remain blank. There are 4 methods of filling in these sections of data; see head_tail_cont_method method below. 1 25 50 75 100 125 150 175 |----------------| calculates median value for point 25 |----------------| calculates median value for point 26 |----------------| calculates median value for point 27 |----------------| calculates median value for point 28 |----------------| calculates median value for point ... |----------------| The truncated running median is similar to the running median, but with the option to exclude bins with counts outside a given range. The tool parameters "low_tmedian" and "high_tmedian" define this range. For example, computing the truncated median with high_tmedian=0.80 excludes bins ranked in the top 20 percent in a given window. If low_tmedian and high_tmedian are both 0, this is equivalent to the running median method without truncation. The sum (high_tmedian + low_tmedian) must be in the range (0,1), otherwise all the data will be excluded. If the median or tmedian option is used, the parameters median_n_bins and head_tail_cont_method must be set (see below). If the tmedian method is chosen, the low_tmedian and high_tmedian parameters must be set (see below). (median_n_bins = 50) [even integer > 4] Number of bins for running or truncated median of continuum calculations. (cont_offset = 1) [integer 1 -> 3] Offset from line center to sample continuum [Line Spread Function width]. The continuum slope calculations are done using formulae derived from the resolving power data plotted in the proposer's guide. Line Spread Function width LSF_WIDTH = (1.695e-2) * (wavelength ** -0.1120) heg = (3.124e-2) * (wavelength ** -0.1689) meg = (5.299e-2) * (wavelength ** -0.0918) leg where wavelength = photon position in Angstroms. To estimate the continuum slope for a peak detected at pos_peak: cont_guess_left = data( pos_peak - cont_offset * LSF_WIDTH ) cont_guess_right = data( pos_peak + cont_offset * LSF_WIDTH ) cont_slope = (cont_guess_right - cont_guess_left) / ( 2 * cont_offset * LSF_WIDTH ) The equations are solved for input spectrum "data" which is given as a function of energy units. Cont_offset is an adjustment parameter. The calculations are done in wavelength units if the 'flux_corrected' parameter is set to 'NO'. (head_tail_cont_method = default) [string: default|values| extrapolate|copy] Method to calculate the head and tail of the continuum, when using the running median or truncated median continuum calculation methods. The head and tail sections are equivalent to median_n_bins/2 on each end of the spectrum. The current methods are: default -> sets the head and tail bins to 0 counts values -> sets the head and tail to user values; values are set in the paramerters called "head_value" and "tail_value", described below extrapolate -> uses extrapolation of the neighbors to calculate the head/tail; the two numbers used for the extrapolation are the nearest 2 neighbors with calculated continuum values. copy -> uses a copy of the neighbor data values to get the extra points outside the head and tail sections in order to perform median/tmedian calculations on the head and the tail. EX: copy of this ^ ^ | | |-------|-------| -25 25 50 |----------------| calculates median value for point 1 |----------------| calculates median value for point 2 |----------------| calculates median value for point 3 |----------------| calculates median value for point 4 |----------------| calculates median value for point ... |----------------| (head_value = 0) [real] If head_tail_cont_method=values, set short wavelength (left) end here. The left median_n_bins/2 will all be set to this value. (tail_value = 0) [real] If head_tail_cont_method=values, set long wavelength (right) end here. The right median_n_bins/2 will all be set to this value. (low_tmedian = 0.20) [real 0 -> 1] For truncated median, reject bin values <= (low_tmedian * median). The sum (high_tmedian + low_tmedian) must be less than 1. (high_tmedian = 0) [real 0 -> 1] For truncated median, reject bins values >= (high_tmedian * median). The sum (high_tmedian + low_tmedian) must be less than 1. (clobber = no) [boolean] If set to yes, existing output files will be overwritten. (display = 1) [integer 0:5] Program verbosity. 0 means display no messages to the screen while running. 5 means report maximum on what the program is doing. (mode = "ql") [string] Parameter interface mode. "ql" stands for "query and learn", which prompts you for input and updates "specdetect.par" with your responses. EXAMPLES 1. Find peaks with amplitudes of at least 40 counts above the continuum, using the cubic spline method, from file marx_zoo.dat > pset specdetect input data file [root.dat] (new_marx_zoo2.dat): number of columns in input data file (3:4) (3): Is the input file flux corected? (no): output file name [root.peaks] (.): output continuum file name [root_cont.dat] (.): output (data-continuum) file name [root_nocont.dat] (.): emission line detect threshold = (thresh_emission * uncertainty) (0) (4): Grating associated with the input data (HEG|heg|MEG|meg|LEG|leg) (HEG): optional input continuum file name [root_cont.dat] (none): absorption line detect threshold = (thresh_absorption * uncertainty) (0) (2): Continuum calculation method (median|spline3|tmedian|test) (median): Number of bins for median of continuum calc - EVEN number (4) (50): Offset from line center to sample continuum [LSF width] (1:20) (1): Head and tail t/median continuum calc (default|values|extrapolate|copy) (extrapolate): If head_tail_cont_method=values, set short wavelength (left) end (0): If head_tail_cont_method=values, set long wavelength (right) end (0): For truncated median, reject bin values <= (low_tmedian * median) (0:1) (0.2): For truncated median, reject bins values >= (high_tmedian * median) (0:1) (0): OK to overwrite existing output files? (yes): debug level (0 = no debug) (0:6) (1): mode (ql): read 2574 data points found 16 peaks The output .peaks file looks like this: # # data file: new_marx_zoo2.dat # minimum amplitude emission: 4.0 # minimum amplitude absoption: 2.0 # peaks: 16 # # peak ampl cont cont_slope fwhm pos pos_bin_wid # ----------------------------------------------------------------------------- 1 2.150000e+02 4.525000e+01 2.682681e+02 3.340000e-02 6.176100 0.011100 2 1.210000e+02 6.724551e+01 4.589025e+01 3.340000e-02 7.166000 0.011200 3 1.380000e+02 5.975000e+01 9.332377e+01 4.450000e-02 7.989100 0.011100 4 1.610000e+02 6.337275e+01 -3.272002e+01 4.450000e-02 8.311700 0.011100 5 1.980000e+02 6.165390e+01 -3.168631e+01 3.330000e-02 8.422900 0.011100 6 1.230000e+02 1.700000e+01 -2.562167e+01 3.300000e-02 10.614000 0.011000 7 7.600000e+01 1.618116e+01 -1.922536e+01 3.400000e-02 10.659000 0.011000 8 6.500000e+01 1.691667e+01 2.630394e+01 7.800000e-02 10.981000 0.011000 9 7.800000e+01 2.075000e+01 1.538023e+02 7.800000e-02 11.026000 0.011000 10 9.600000e+01 2.125000e+01 0.000000e+00 3.300000e-02 11.181000 0.011000 11 4.900000e+01 1.709091e+01 -6.126157e+01 2.200000e-02 11.270000 0.011000 12 6.700000e+01 1.100379e+01 -2.568731e+01 3.300000e-02 11.426000 0.011000 13 1.060000e+02 1.004167e+01 0.000000e+00 2.200000e-02 11.760000 0.011000 14 6.400000e+01 7.000000e+00 0.000000e+00 3.300000e-02 12.138000 0.011000 15 4.400000e+01 7.000000e+00 0.000000e+00 4.500000e-02 12.182000 0.012000 16 3.700000e+01 1.279412e+00 -4.624471e+00 4.500000e-02 18.978000 0.012000 DATE 05/14/98