WDmodel.fit module¶
Core data processing and fitting/sampling routines
-
WDmodel.fit.
blotch_spectrum
(spec, linedata)[source]¶ Automagically remove cosmic rays and gaps from spectrum
- Parameters
spec (
numpy.recarray
) – The spectrum withdtype=[('wave', '<f8'), ('flux', '<f8'), ('flux_err', '<f8')]
linedata (
numpy.recarray
) – The observations of the spectrum corresponding to the hydrogen Balmer lines. Must havedtype=[('wave', '<f8'), ('flux', '<f8'), ('flux_err', '<f8'), ('line_mask', 'i4'), ('line_ind','i4')]
Produced byorig_cut_lines()
- Returns
spec – The blotched spectrum with
dtype=[('wave', '<f8'), ('flux', '<f8'), ('flux_err', '<f8')]
- Return type
Notes
Some spectra have nasty cosmic rays or gaps in the data. This routine does a reasonable job blotching these by Wiener filtering the spectrum, marking features that differ significantly from the local variance in the region, and replace them with the filtered values. The hydrogen Balmer lines are preserved, so if your gap/cosmic ray lands on a line it will not be filtered. Additionally, filtering has edge effects, and these data are preserved as well. If you do blotch the spectrum, it is highly recommended that you use the bluelimit and redlimit options to trim the ends of the spectrum. Note that the spectrum will be rejected if it has flux or flux errors that are not finite or below zero. This is often the case with cosmic rays and gaps, so you will likely have to do some manual removal of these points.
YOU SHOULD PROBABLY PRE-PROCESS YOUR DATA YOURSELF BEFORE FITTING IT AND NOT BE LAZY! THIS ROUTINE ONLY EXISTS TO FIT QUICK LOOK SPECTRUM AT THE TELESCOPE, BEFORE FINAL REDUCTIONS!
-
WDmodel.fit.
fit_model
(spec, phot, model, covmodel, pbs, params, objname, outdir, specfile, phot_dispersion=0.0, samptype='ensemble', ascale=2.0, ntemps=1, nwalkers=300, nburnin=50, nprod=1000, everyn=1, thin=1, pool=None, resume=False, redo=False)[source]¶ Core routine that models the spectrum using the white dwarf model and a Gaussian process with a stationary kernel to account for any flux miscalibration, sampling the posterior using a MCMC.
- Parameters
spec (
numpy.recarray
) – The spectrum withdtype=[('wave', '<f8'), ('flux', '<f8'), ('flux_err', '<f8')]
phot (
numpy.recarray
) – The photometry ofobjname
withdtype=[('pb', 'str'), ('mag', '<f8'), ('mag_err', '<f8')]
model (
WDmodel.WDmodel.WDmodel
instance) – The DA White Dwarf SED model generatorpbs (dict) – Passband dictionary containing the passbands corresponding to phot.pb` and generated by
WDmodel.passband.get_pbmodel()
.params (dict) – A parameter dict such as that produced by
WDmodel.io.read_params()
objname (str) – object name - used to save output with correct name
outdir (str) – controls where the chain file s written
specfile (str) – Used in the title, and to set the name of the
outfile
phot_dispersion (float, optional) – Excess photometric dispersion to add in quadrature with the photometric uncertainties
phot.mag_err
. Use if the errors are grossly underestimated. Default is0.
samptype (
{'ensemble', 'pt', 'gibbs'}
) – Which sampler to use. The default isensemble
.ascale (float) – The proposal scale for the sampler. Default is
2.
ntemps (int) – The number of temperatures to run walkers at. Only used if
samptype
is in{'pt','gibbs'}
and set to1.
forensemble
. See a short summary review for details. Default is1.
nwalkers (int) – The number of Goodman and Weare walkers. Default is
300
.nburnin (int) – The number of steps to discard as burn-in for the Markov-Chain. Default is
500
.nprod (int) – The number of production steps in the Markov-Chain. Default is
1000
.everyn (int, optional) – If the posterior function is evaluated using only every nth observation from the data, this should be specified. Default is
1
.thin (int) – Only save every
thin
steps to the output Markov Chain. Useful, if brute force way of reducing correlation between samples.pool (None or :py:class`emcee.utils.MPIPool`) – If running with MPI, the pool object is used to distribute the computations among the child process
resume (bool) – If
True
, restores state and resumes the chain for anothernprod
iterations.redo (bool) – If
True
, and a chain file and state file exist, simply clobbers them.
- Returns
free_param_names (list) – names of parameters that were fit for. Names correspond to keys in
params
and the order of parameters insamples
.samples (array-like) – The flattened Markov Chain with the parameter positions. Shape is
(ntemps*nwalkers*nprod, nparam)
samples_lnprob (array-like) – The flattened log of the posterior corresponding to the positions in
samples
. Shape is(ntemps*nwalkers*nprod, 1)
everyn (int) – Specifies sampling of the data used to compute the posterior. Provided in case we are using
resume
to continue the chain, and this value must be restored from the state file, rather than being supplied as a user input.shape (tuple) – Specifies the shape of the un-flattened chain.
(ntemps, nwalkers, nprod, nparam)
Provided in case we are usingresume
to continue the chain, and this value must be restored from the state file, rather than being supplied as a user input.
- Raises
RuntimeError – If
resume
is set without the chain having been run in the first place.
Notes
Uses an Ensemble MCMC (implemented by emcee) to generate samples from the posterior. Does a short burn-in around the initial guess model parameters - either
minuit
or user supplied values/defaults. Model parameters may be frozen/fixed. Parameters can have bounds limiting their range. Then runs a full production change. Chain state is saved after every 100 production steps, and may be continued after the first 100 steps if interrupted or found to be too short. Progress is indicated visually with a progress bar that is written to STDOUT.See also
-
WDmodel.fit.
fix_pos
(pos, free_param_names, params)[source]¶ Ensures that the initial positions of the
emcee
walkers are out of bounds- Parameters
pos (array-like) – starting positions of all the walkers, such as that produced by
utils.sample_ball
free_param_names (iterable) – names of parameters that are free to float. Names must correspond to keys in
params
.params (dict) – A parameter dict such as that produced by
WDmodel.io.read_params()
- Returns
pos – starting positions of all the walkers, fixed to guarantee that they are within
bounds
defined inparams
- Return type
array-like
Notes
emcee.utils.sample_ball()
creates random walkers that may be initialized out of bounds. These walkers get stuck as there is no step they can take that will make the change in loglikelihood finite. This makes the chain appear strongly correlated since all the samples of one walker are at a fixed location. This resolves the issue by assuming that the parametervalue
was withinbounds
to begin with. This routine does not do any checking of types, values or bounds. This check is done byWDmodel.io.get_params_from_argparse()
before the fit. If you setup the fit using an external code, you should check these values.
-
WDmodel.fit.
get_fit_params_from_samples
(param_names, samples, samples_lnprob, params, ntemps=1, nwalkers=300, nprod=1000, discard=5)[source]¶ Get the marginalized parameters from the sample chain
- Parameters
param_names (list) – names of parameters that were fit for. Names correspond to keys in
params
and the order of parameters insamples
.samples (array-like) – The flattened Markov Chain with the parameter positions. Shape is
(ntemps*nwalkers*nprod, nparam)
samples_lnprob (array-like) – The flattened log of the posterior corresponding to the positions in
samples
. Shape is(ntemps*nwalkers*nprod, 1)
params (dict) – A parameter dict such as that produced by
WDmodel.io.read_params()
ntemps (int) – The number of temperatures chains were run at. Default is
1.
nwalkers (int) –
The number of Goodman and Weare walkers used in the fit. Default is
300
.nprod (int) – The number of production steps in the Markov-Chain. Default is
1000
.discard (int) – percentage of nprod steps from the start of the chain to discard in analyzing samples
- Returns
mcmc_params (dict) – The output parameter dictionary with updated parameter estimates, errors and a scale.
params
.out_samples (array-like) – The flattened Markov Chain with the parameter positions with the first
%discard
tossed.out_ samples_lnprob (array-like) – The flattened log of the posterior corresponding to the positions in
samples
with the first%discard
samples tossed.
See also
-
WDmodel.fit.
hyper_param_guess
(spec, phot, model, pbs, params)[source]¶ Makes a guess for the parameter
mu
after the initial fit byquick_fit_spec_model()
- Parameters
spec (
numpy.recarray
) – The spectrum withdtype=[('wave', '<f8'), ('flux', '<f8'), ('flux_err', '<f8')]
phot (
numpy.recarray
) – The photometry ofobjname
withdtype=[('pb', 'str'), ('mag', '<f8'), ('mag_err', '<f8')]
model (
WDmodel.WDmodel.WDmodel
instance) – The DA White Dwarf SED model generatorpbs (dict) – Passband dictionary containing the passbands corresponding to phot.pb` and generated by
WDmodel.passband.get_pbmodel()
.params (dict) – A parameter dict such as that produced by
WDmodel.io.read_params()
- Returns
out_params – The output parameter dictionary with an initial guess for
mu
- Return type
Notes
Uses the initial guess of parameters from the spectrum fit by
quick_fit_spec_model()
to construct an initial guess of the SED, and computesmu
(which looks like a distance modulus, but also includes a normalization for the radius of the DA white dwarf, and it’s radius) as the median difference between the observed and synthetic photometry.
-
WDmodel.fit.
orig_cut_lines
(spec, model)[source]¶ Cut out the hydrogen Balmer spectral lines defined in
WDmodel.WDmodel.WDmodel
from the spectrum.The masking of Balmer lines is basic, and not very effective at high surface gravity or low temperature, or in the presence of non hydrogen lines. It’s used to get a roughly masked set of data suitable for continuum detection, and is effective in the context of our ground-based spectroscopic followup campaign for HST GO 12967 and 13711 programs.
- Parameters
spec (
numpy.recarray
) – The spectrum withdtype=[('wave', '<f8'), ('flux', '<f8'), ('flux_err', '<f8')]
model (
WDmodel.WDmodel.WDmodel
instance) – The DA White Dwarf SED model generator
- Returns
linedata (
numpy.recarray
) – The observations of the spectrum corresponding to the hydrogen Balmer lines. Hasdtype=[('wave', '<f8'), ('flux', '<f8'), ('flux_err', '<f8'), ('line_mask', 'i4'), (line_ind', 'i4')]
continuumdata (
numpy.recarray
) – The continuum data. Hasdtype=[('wave', '<f8'), ('flux', '<f8'), ('flux_err', '<f8')]
Notes
Does a coarse cut to remove hydrogen absorption lines from DA white dwarf spectra The line centroids, and widths are fixed and defined with the model grid This is insufficient, and particularly at high surface gravity and low temperatures the lines are blended. This routine is intended to provide a rough starting point for the process of continuum determination.
-
WDmodel.fit.
polyfit_continuum
(continuumdata, wave)[source]¶ Fit a polynomial to the DA white dwarf continuum to normalize it - purely for visualization purposes
- Parameters
continuumdata (
numpy.recarray
) – The continuum data. Must havedtype=[('wave', '<f8'), ('flux', '<f8'), ('flux_err', '<f8')]
Produced by running the spectrum throughWDmodel.fit.orig_cut_lines()
and extracting the pre-defined lines in theWDmodel.WDmodel.WDmodel
instance.wave (array-like) – The full spectrum wavelength array on which to interpolate the continuum model
- Returns
cont_model – The continuum model Must have
dtype=[('wave', '<f8'), ('flux', '<f8')]
- Return type
Notes
Roughly follows the algorithm described by the SDSS SSPP for a global continuum fit. Fits a red side and blue side at 5500 A separately to get a smooth polynomial representation. The red side uses a degree 5 polynomial and the blue side uses a degree 9 polynomial. Then “splices” them together - I don’t actually know how SDSS does this, but we simply assert the two bits are the same function - and fits the full continuum to a degree 9 polynomial.
-
WDmodel.fit.
pre_process_spectrum
(spec, bluelimit, redlimit, model, params, lamshift=0.0, vel=0.0, rebin=1, blotch=False, rescale=False)[source]¶ Pre-process the spectrum before fitting
- Parameters
spec (
numpy.recarray
) – The spectrum withdtype=[('wave', '<f8'), ('flux', '<f8'), ('flux_err', '<f8')]
bluelimit (None or float) – Trim wavelengths bluer than this limit. Uses the bluest wavelength of spectrum if
None
redlimit (None or float) – Trim wavelengths redder than this limit. Uses the reddest wavelength of spectrum if
None
model (
WDmodel.WDmodel.WDmodel
instance) – The DA White Dwarf SED model generatorparams (dict) – A parameter dict such as that produced by
WDmodel.io.read_params()
Will be modified to adjust the spectrum normalization parametersdl
limits ifrescale
is setlamshift (float, optional) – Apply a flat wavelength shift to the spectrum. Useful if the target was not properly centered in the slit, and the shift is not correlated with wavelength. Default is
0
.vel (float, optional) – Apply a velocity shift to the spectrum. Default is
0
.rebin (int, optional) – Integer factor by which to rebin the spectrum. Default is
1
(no rebinning).blotch (bool, optional) – Attempt to remove cosmic rays and gaps from spectrum. Only to be used for quick look analysis at the telescope.
rescale (bool, optional) – Rescale the spectrum to make the median noise
~1
. Has no effect on fitted parameters except spectrum flux normalization parameterdl
but makes residual plots, histograms more easily interpretable as they can be compared to anN(0, 1)
distribution.
- Returns
spec – The spectrum with
dtype=[('wave', '<f8'), ('flux', '<f8'), ('flux_err', '<f8')]
- Return type
-
WDmodel.fit.
quick_fit_spec_model
(spec, model, params)[source]¶ Does a quick fit of the spectrum to get an initial guess of the fit parameters
Uses iminuit to do a rough diagonal fit - i.e. ignores covariance. For simplicity, also fixed FWHM and Rv (even when set to be fit). Therefore, only teff, logg, av, dl are fit for (at most). This isn’t robust, but it’s good enough for an initial guess.
- Parameters
spec (
numpy.recarray
) – The spectrum withdtype=[('wave', '<f8'), ('flux', '<f8'), ('flux_err', '<f8')]
model (
WDmodel.WDmodel.WDmodel
instance) – The DA White Dwarf SED model generatorparams (dict) – A parameter dict such as that produced by
WDmodel.io.read_params()
- Returns
migrad_params – The output parameter dictionary with updated initial guesses stored in the
value
key. Same format asparams
.- Return type
- Raises
RuntimeError – If all of
teff, logg, av, dl
are set as fixed - there’s nothing to fit.RuntimeWarning – If
minuit.Minuit.migrad()
orminuit.Minuit.hesse()
indicate that the fit is unreliable
Notes
None of the starting values for the parameters maybe
None
EXCEPTc
. This refines the starting guesses, and determines a reasonable value forc
-
WDmodel.fit.
rebin_spec_by_int_factor
(spec, f=1)[source]¶ Rebins a spectrum by an integer factor f
- Parameters
spec (
numpy.recarray
) – The spectrum withdtype=[('wave', '<f8'), ('flux', '<f8'), ('flux_err', '<f8')]
f (int, optional) – an integer factor to rebin the spectrum by. Default is
1
(no rebinning)
- Returns
rspec – The rebinned spectrum with
dtype=[('wave', '<f8'), ('flux', '<f8'), ('flux_err', '<f8')]
- Return type
Notes
If the spectrum is not divisible by f, the edges are trimmed by discarding the remainder measurements from both ends. If the remainder itself is odd, the extra measurement is discarded from the blue side.