WDmodel.fit module

Core data processing and fitting/sampling routines

WDmodel.fit.blotch_spectrum(spec, linedata)[source]

Automagically remove cosmic rays and gaps from spectrum

Parameters:
  • spec (numpy.recarray) – The spectrum with dtype=[('wave', '<f8'), ('flux', '<f8'), ('flux_err', '<f8')]
  • linedata (numpy.recarray) – The observations of the spectrum corresponding to the hydrogen Balmer lines. Must have dtype=[('wave', '<f8'), ('flux', '<f8'), ('flux_err', '<f8'), ('line_mask', 'i4'), ('line_ind','i4')] Produced by orig_cut_lines()
Returns:

spec – The blotched spectrum with dtype=[('wave', '<f8'), ('flux', '<f8'), ('flux_err', '<f8')]

Return type:

numpy.recarray

Notes

Some spectra have nasty cosmic rays or gaps in the data. This routine does a reasonable job blotching these by Wiener filtering the spectrum, marking features that differ significantly from the local variance in the region, and replace them with the filtered values. The hydrogen Balmer lines are preserved, so if your gap/cosmic ray lands on a line it will not be filtered. Additionally, filtering has edge effects, and these data are preserved as well. If you do blotch the spectrum, it is highly recommended that you use the bluelimit and redlimit options to trim the ends of the spectrum. Note that the spectrum will be rejected if it has flux or flux errors that are not finite or below zero. This is often the case with cosmic rays and gaps, so you will likely have to do some manual removal of these points.

YOU SHOULD PROBABLY PRE-PROCESS YOUR DATA YOURSELF BEFORE FITTING IT AND NOT BE LAZY! THIS ROUTINE ONLY EXISTS TO FIT QUICK LOOK SPECTRUM AT THE TELESCOPE, BEFORE FINAL REDUCTIONS!

WDmodel.fit.fit_model(spec, phot, model, covmodel, pbs, params, objname, outdir, specfile, phot_dispersion=0.0, samptype=u'ensemble', ascale=2.0, ntemps=1, nwalkers=300, nburnin=50, nprod=1000, everyn=1, thin=1, pool=None, resume=False, redo=False)[source]

Core routine that models the spectrum using the white dwarf model and a Gaussian process with a stationary kernel to account for any flux miscalibration, sampling the posterior using a MCMC.

Parameters:
  • spec (numpy.recarray) – The spectrum with dtype=[('wave', '<f8'), ('flux', '<f8'), ('flux_err', '<f8')]
  • phot (numpy.recarray) – The photometry of objname with dtype=[('pb', 'str'), ('mag', '<f8'), ('mag_err', '<f8')]
  • model (WDmodel.WDmodel.WDmodel instance) – The DA White Dwarf SED model generator
  • pbs (dict) – Passband dictionary containing the passbands corresponding to phot.pb` and generated by WDmodel.passband.get_pbmodel().
  • params (dict) – A parameter dict such as that produced by WDmodel.io.read_params()
  • objname (str) – object name - used to save output with correct name
  • outdir (str) – controls where the chain file s written
  • specfile (str) – Used in the title, and to set the name of the outfile
  • phot_dispersion (float, optional) – Excess photometric dispersion to add in quadrature with the photometric uncertainties phot.mag_err. Use if the errors are grossly underestimated. Default is 0.
  • samptype ({'ensemble', 'pt', 'gibbs'}) – Which sampler to use. The default is ensemble.
  • ascale (float) – The proposal scale for the sampler. Default is 2.
  • ntemps (int) – The number of temperatures to run walkers at. Only used if samptype is in {'pt','gibbs'} and set to 1. for ensemble. See a short summary review for details. Default is 1.
  • nwalkers (int) – The number of Goodman and Weare walkers. Default is 300.
  • nburnin (int) – The number of steps to discard as burn-in for the Markov-Chain. Default is 500.
  • nprod (int) – The number of production steps in the Markov-Chain. Default is 1000.
  • everyn (int, optional) – If the posterior function is evaluated using only every nth observation from the data, this should be specified. Default is 1.
  • thin (int) – Only save every thin steps to the output Markov Chain. Useful, if brute force way of reducing correlation between samples.
  • pool (None or :py:class`emcee.utils.MPIPool`) – If running with MPI, the pool object is used to distribute the computations among the child process
  • resume (bool) – If True, restores state and resumes the chain for another nprod iterations.
  • redo (bool) – If True, and a chain file and state file exist, simply clobbers them.
Returns:

  • free_param_names (list) – names of parameters that were fit for. Names correspond to keys in params and the order of parameters in samples.
  • samples (array-like) – The flattened Markov Chain with the parameter positions. Shape is (ntemps*nwalkers*nprod, nparam)
  • samples_lnprob (array-like) – The flattened log of the posterior corresponding to the positions in samples. Shape is (ntemps*nwalkers*nprod, 1)
  • everyn (int) – Specifies sampling of the data used to compute the posterior. Provided in case we are using resume to continue the chain, and this value must be restored from the state file, rather than being supplied as a user input.
  • shape (tuple) – Specifies the shape of the un-flattened chain. (ntemps, nwalkers, nprod, nparam) Provided in case we are using resume to continue the chain, and this value must be restored from the state file, rather than being supplied as a user input.

Raises:

RuntimeError – If resume is set without the chain having been run in the first place.

Notes

Uses an Ensemble MCMC (implemented by emcee) to generate samples from the posterior. Does a short burn-in around the initial guess model parameters - either minuit or user supplied values/defaults. Model parameters may be frozen/fixed. Parameters can have bounds limiting their range. Then runs a full production change. Chain state is saved after every 100 production steps, and may be continued after the first 100 steps if interrupted or found to be too short. Progress is indicated visually with a progress bar that is written to STDOUT.

WDmodel.fit.fix_pos(pos, free_param_names, params)[source]

Ensures that the initial positions of the emcee walkers are out of bounds

Parameters:
  • pos (array-like) – starting positions of all the walkers, such as that produced by utils.sample_ball
  • free_param_names (iterable) – names of parameters that are free to float. Names must correspond to keys in params.
  • params (dict) – A parameter dict such as that produced by WDmodel.io.read_params()
Returns:

pos – starting positions of all the walkers, fixed to guarantee that they are within bounds defined in params

Return type:

array-like

Notes

emcee.utils.sample_ball() creates random walkers that may be initialized out of bounds. These walkers get stuck as there is no step they can take that will make the change in loglikelihood finite. This makes the chain appear strongly correlated since all the samples of one walker are at a fixed location. This resolves the issue by assuming that the parameter value was within bounds to begin with. This routine does not do any checking of types, values or bounds. This check is done by WDmodel.io.get_params_from_argparse() before the fit. If you setup the fit using an external code, you should check these values.

WDmodel.fit.get_fit_params_from_samples(param_names, samples, samples_lnprob, params, ntemps=1, nwalkers=300, nprod=1000, discard=5)[source]

Get the marginalized parameters from the sample chain

Parameters:
  • param_names (list) – names of parameters that were fit for. Names correspond to keys in params and the order of parameters in samples.
  • samples (array-like) – The flattened Markov Chain with the parameter positions. Shape is (ntemps*nwalkers*nprod, nparam)
  • samples_lnprob (array-like) – The flattened log of the posterior corresponding to the positions in samples. Shape is (ntemps*nwalkers*nprod, 1)
  • params (dict) – A parameter dict such as that produced by WDmodel.io.read_params()
  • ntemps (int) – The number of temperatures chains were run at. Default is 1.
  • nwalkers (int) –

    The number of Goodman and Weare walkers used in the fit. Default is 300.

  • nprod (int) – The number of production steps in the Markov-Chain. Default is 1000.
  • discard (int) – percentage of nprod steps from the start of the chain to discard in analyzing samples
Returns:

  • mcmc_params (dict) – The output parameter dictionary with updated parameter estimates, errors and a scale. params.
  • out_samples (array-like) – The flattened Markov Chain with the parameter positions with the first %discard tossed.
  • out_ samples_lnprob (array-like) – The flattened log of the posterior corresponding to the positions in samples with the first %discard samples tossed.

See also

fit_model()

WDmodel.fit.hyper_param_guess(spec, phot, model, pbs, params)[source]

Makes a guess for the parameter mu after the initial fit by quick_fit_spec_model()

Parameters:
Returns:

out_params – The output parameter dictionary with an initial guess for mu

Return type:

dict

Notes

Uses the initial guess of parameters from the spectrum fit by quick_fit_spec_model() to construct an initial guess of the SED, and computes mu (which looks like a distance modulus, but also includes a normalization for the radius of the DA white dwarf, and it’s radius) as the median difference between the observed and synthetic photometry.

WDmodel.fit.orig_cut_lines(spec, model)[source]

Cut out the hydrogen Balmer spectral lines defined in WDmodel.WDmodel.WDmodel from the spectrum.

The masking of Balmer lines is basic, and not very effective at high surface gravity or low temperature, or in the presence of non hydrogen lines. It’s used to get a roughly masked set of data suitable for continuum detection, and is effective in the context of our ground-based spectroscopic followup campaign for HST GO 12967 and 13711 programs.

Parameters:
  • spec (numpy.recarray) – The spectrum with dtype=[('wave', '<f8'), ('flux', '<f8'), ('flux_err', '<f8')]
  • model (WDmodel.WDmodel.WDmodel instance) – The DA White Dwarf SED model generator
Returns:

  • linedata (numpy.recarray) – The observations of the spectrum corresponding to the hydrogen Balmer lines. Has dtype=[('wave', '<f8'), ('flux', '<f8'), ('flux_err', '<f8'), ('line_mask', 'i4'), (line_ind', 'i4')]
  • continuumdata (numpy.recarray) – The continuum data. Has dtype=[('wave', '<f8'), ('flux', '<f8'), ('flux_err', '<f8')]

Notes

Does a coarse cut to remove hydrogen absorption lines from DA white dwarf spectra The line centroids, and widths are fixed and defined with the model grid This is insufficient, and particularly at high surface gravity and low temperatures the lines are blended. This routine is intended to provide a rough starting point for the process of continuum determination.

WDmodel.fit.polyfit_continuum(continuumdata, wave)[source]

Fit a polynomial to the DA white dwarf continuum to normalize it - purely for visualization purposes

Parameters:
  • continuumdata (numpy.recarray) – The continuum data. Must have dtype=[('wave', '<f8'), ('flux', '<f8'), ('flux_err', '<f8')] Produced by running the spectrum through WDmodel.fit.orig_cut_lines() and extracting the pre-defined lines in the WDmodel.WDmodel.WDmodel instance.
  • wave (array-like) – The full spectrum wavelength array on which to interpolate the continuum model
Returns:

cont_model – The continuum model Must have dtype=[('wave', '<f8'), ('flux', '<f8')]

Return type:

numpy.recarray

Notes

Roughly follows the algorithm described by the SDSS SSPP for a global continuum fit. Fits a red side and blue side at 5500 A separately to get a smooth polynomial representation. The red side uses a degree 5 polynomial and the blue side uses a degree 9 polynomial. Then “splices” them together - I don’t actually know how SDSS does this, but we simply assert the two bits are the same function - and fits the full continuum to a degree 9 polynomial.

WDmodel.fit.pre_process_spectrum(spec, bluelimit, redlimit, model, params, lamshift=0.0, vel=0.0, rebin=1, blotch=False, rescale=False)[source]

Pre-process the spectrum before fitting

Parameters:
  • spec (numpy.recarray) – The spectrum with dtype=[('wave', '<f8'), ('flux', '<f8'), ('flux_err', '<f8')]
  • bluelimit (None or float) – Trim wavelengths bluer than this limit. Uses the bluest wavelength of spectrum if None
  • redlimit (None or float) – Trim wavelengths redder than this limit. Uses the reddest wavelength of spectrum if None
  • model (WDmodel.WDmodel.WDmodel instance) – The DA White Dwarf SED model generator
  • params (dict) – A parameter dict such as that produced by WDmodel.io.read_params() Will be modified to adjust the spectrum normalization parameters dl limits if rescale is set
  • lamshift (float, optional) – Apply a flat wavelength shift to the spectrum. Useful if the target was not properly centered in the slit, and the shift is not correlated with wavelength. Default is 0.
  • vel (float, optional) – Apply a velocity shift to the spectrum. Default is 0.
  • rebin (int, optional) – Integer factor by which to rebin the spectrum. Default is 1 (no rebinning).
  • blotch (bool, optional) – Attempt to remove cosmic rays and gaps from spectrum. Only to be used for quick look analysis at the telescope.
  • rescale (bool, optional) – Rescale the spectrum to make the median noise ~1. Has no effect on fitted parameters except spectrum flux normalization parameter dl but makes residual plots, histograms more easily interpretable as they can be compared to an N(0, 1) distribution.
Returns:

spec – The spectrum with dtype=[('wave', '<f8'), ('flux', '<f8'), ('flux_err', '<f8')]

Return type:

numpy.recarray

WDmodel.fit.quick_fit_spec_model(spec, model, params)[source]

Does a quick fit of the spectrum to get an initial guess of the fit parameters

Uses iminuit to do a rough diagonal fit - i.e. ignores covariance. For simplicity, also fixed FWHM and Rv (even when set to be fit). Therefore, only teff, logg, av, dl are fit for (at most). This isn’t robust, but it’s good enough for an initial guess.

Parameters:
Returns:

migrad_params – The output parameter dictionary with updated initial guesses stored in the value key. Same format as params.

Return type:

dict

Raises:
  • RuntimeError – If all of teff, logg, av, dl are set as fixed - there’s nothing to fit.
  • RuntimeWarning – If minuit.Minuit.migrad() or minuit.Minuit.hesse() indicate that the fit is unreliable

Notes

None of the starting values for the parameters maybe None EXCEPT c. This refines the starting guesses, and determines a reasonable value for c

WDmodel.fit.rebin_spec_by_int_factor(spec, f=1)[source]

Rebins a spectrum by an integer factor f

Parameters:
  • spec (numpy.recarray) – The spectrum with dtype=[('wave', '<f8'), ('flux', '<f8'), ('flux_err', '<f8')]
  • f (int, optional) – an integer factor to rebin the spectrum by. Default is 1 (no rebinning)
Returns:

rspec – The rebinned spectrum with dtype=[('wave', '<f8'), ('flux', '<f8'), ('flux_err', '<f8')]

Return type:

numpy.recarray

Notes

If the spectrum is not divisible by f, the edges are trimmed by discarding the remainder measurements from both ends. If the remainder itself is odd, the extra measurement is discarded from the blue side.