WDmodel.fit module¶
Core data processing and fitting/sampling routines
-
WDmodel.fit.
blotch_spectrum
(spec, linedata)[source]¶ Automagically remove cosmic rays and gaps from spectrum
Parameters: - spec (
numpy.recarray
) – The spectrum withdtype=[('wave', '<f8'), ('flux', '<f8'), ('flux_err', '<f8')]
- linedata (
numpy.recarray
) – The observations of the spectrum corresponding to the hydrogen Balmer lines. Must havedtype=[('wave', '<f8'), ('flux', '<f8'), ('flux_err', '<f8'), ('line_mask', 'i4'), ('line_ind','i4')]
Produced byorig_cut_lines()
Returns: spec – The blotched spectrum with
dtype=[('wave', '<f8'), ('flux', '<f8'), ('flux_err', '<f8')]
Return type: Notes
Some spectra have nasty cosmic rays or gaps in the data. This routine does a reasonable job blotching these by Wiener filtering the spectrum, marking features that differ significantly from the local variance in the region, and replace them with the filtered values. The hydrogen Balmer lines are preserved, so if your gap/cosmic ray lands on a line it will not be filtered. Additionally, filtering has edge effects, and these data are preserved as well. If you do blotch the spectrum, it is highly recommended that you use the bluelimit and redlimit options to trim the ends of the spectrum. Note that the spectrum will be rejected if it has flux or flux errors that are not finite or below zero. This is often the case with cosmic rays and gaps, so you will likely have to do some manual removal of these points.
YOU SHOULD PROBABLY PRE-PROCESS YOUR DATA YOURSELF BEFORE FITTING IT AND NOT BE LAZY! THIS ROUTINE ONLY EXISTS TO FIT QUICK LOOK SPECTRUM AT THE TELESCOPE, BEFORE FINAL REDUCTIONS!
- spec (
-
WDmodel.fit.
fit_model
(spec, phot, model, covmodel, pbs, params, objname, outdir, specfile, phot_dispersion=0.0, samptype=u'ensemble', ascale=2.0, ntemps=1, nwalkers=300, nburnin=50, nprod=1000, everyn=1, thin=1, pool=None, resume=False, redo=False)[source]¶ Core routine that models the spectrum using the white dwarf model and a Gaussian process with a stationary kernel to account for any flux miscalibration, sampling the posterior using a MCMC.
Parameters: - spec (
numpy.recarray
) – The spectrum withdtype=[('wave', '<f8'), ('flux', '<f8'), ('flux_err', '<f8')]
- phot (
numpy.recarray
) – The photometry ofobjname
withdtype=[('pb', 'str'), ('mag', '<f8'), ('mag_err', '<f8')]
- model (
WDmodel.WDmodel.WDmodel
instance) – The DA White Dwarf SED model generator - pbs (dict) – Passband dictionary containing the passbands corresponding to
phot.pb` and generated by
WDmodel.passband.get_pbmodel()
. - params (dict) – A parameter dict such as that produced by
WDmodel.io.read_params()
- objname (str) – object name - used to save output with correct name
- outdir (str) – controls where the chain file s written
- specfile (str) – Used in the title, and to set the name of the
outfile
- phot_dispersion (float, optional) – Excess photometric dispersion to add in quadrature with the
photometric uncertainties
phot.mag_err
. Use if the errors are grossly underestimated. Default is0.
- samptype (
{'ensemble', 'pt', 'gibbs'}
) – Which sampler to use. The default isensemble
. - ascale (float) – The proposal scale for the sampler. Default is
2.
- ntemps (int) – The number of temperatures to run walkers at. Only used if
samptype
is in{'pt','gibbs'}
and set to1.
forensemble
. See a short summary review for details. Default is1.
- nwalkers (int) – The number of Goodman and Weare walkers. Default is
300
. - nburnin (int) – The number of steps to discard as burn-in for the Markov-Chain. Default is
500
. - nprod (int) – The number of production steps in the Markov-Chain. Default is
1000
. - everyn (int, optional) – If the posterior function is evaluated using only every nth
observation from the data, this should be specified. Default is
1
. - thin (int) – Only save every
thin
steps to the output Markov Chain. Useful, if brute force way of reducing correlation between samples. - pool (None or :py:class`emcee.utils.MPIPool`) – If running with MPI, the pool object is used to distribute the computations among the child process
- resume (bool) – If
True
, restores state and resumes the chain for anothernprod
iterations. - redo (bool) – If
True
, and a chain file and state file exist, simply clobbers them.
Returns: - free_param_names (list) – names of parameters that were fit for. Names correspond to keys in
params
and the order of parameters insamples
. - samples (array-like) – The flattened Markov Chain with the parameter positions.
Shape is
(ntemps*nwalkers*nprod, nparam)
- samples_lnprob (array-like) – The flattened log of the posterior corresponding to the positions in
samples
. Shape is(ntemps*nwalkers*nprod, 1)
- everyn (int) – Specifies sampling of the data used to compute the posterior. Provided
in case we are using
resume
to continue the chain, and this value must be restored from the state file, rather than being supplied as a user input. - shape (tuple) – Specifies the shape of the un-flattened chain.
(ntemps, nwalkers, nprod, nparam)
Provided in case we are usingresume
to continue the chain, and this value must be restored from the state file, rather than being supplied as a user input.
Raises: RuntimeError
– Ifresume
is set without the chain having been run in the first place.Notes
Uses an Ensemble MCMC (implemented by emcee) to generate samples from the posterior. Does a short burn-in around the initial guess model parameters - either
minuit
or user supplied values/defaults. Model parameters may be frozen/fixed. Parameters can have bounds limiting their range. Then runs a full production change. Chain state is saved after every 100 production steps, and may be continued after the first 100 steps if interrupted or found to be too short. Progress is indicated visually with a progress bar that is written to STDOUT.See also
- spec (
-
WDmodel.fit.
fix_pos
(pos, free_param_names, params)[source]¶ Ensures that the initial positions of the
emcee
walkers are out of boundsParameters: - pos (array-like) – starting positions of all the walkers, such as that produced by
utils.sample_ball
- free_param_names (iterable) – names of parameters that are free to float. Names must correspond to keys in
params
. - params (dict) – A parameter dict such as that produced by
WDmodel.io.read_params()
Returns: pos – starting positions of all the walkers, fixed to guarantee that they are within
bounds
defined inparams
Return type: array-like
Notes
emcee.utils.sample_ball()
creates random walkers that may be initialized out of bounds. These walkers get stuck as there is no step they can take that will make the change in loglikelihood finite. This makes the chain appear strongly correlated since all the samples of one walker are at a fixed location. This resolves the issue by assuming that the parametervalue
was withinbounds
to begin with. This routine does not do any checking of types, values or bounds. This check is done byWDmodel.io.get_params_from_argparse()
before the fit. If you setup the fit using an external code, you should check these values.- pos (array-like) – starting positions of all the walkers, such as that produced by
-
WDmodel.fit.
get_fit_params_from_samples
(param_names, samples, samples_lnprob, params, ntemps=1, nwalkers=300, nprod=1000, discard=5)[source]¶ Get the marginalized parameters from the sample chain
Parameters: - param_names (list) – names of parameters that were fit for. Names correspond to keys in
params
and the order of parameters insamples
. - samples (array-like) – The flattened Markov Chain with the parameter positions.
Shape is
(ntemps*nwalkers*nprod, nparam)
- samples_lnprob (array-like) – The flattened log of the posterior corresponding to the positions in
samples
. Shape is(ntemps*nwalkers*nprod, 1)
- params (dict) – A parameter dict such as that produced by
WDmodel.io.read_params()
- ntemps (int) – The number of temperatures chains were run at. Default is
1.
- nwalkers (int) –
The number of Goodman and Weare walkers used in the fit. Default is
300
. - nprod (int) – The number of production steps in the Markov-Chain. Default is
1000
. - discard (int) – percentage of nprod steps from the start of the chain to discard in analyzing samples
Returns: - mcmc_params (dict) – The output parameter dictionary with updated parameter estimates,
errors and a scale.
params
. - out_samples (array-like) – The flattened Markov Chain with the parameter positions with the first
%discard
tossed. - out_ samples_lnprob (array-like) – The flattened log of the posterior corresponding to the positions in
samples
with the first%discard
samples tossed.
See also
- param_names (list) – names of parameters that were fit for. Names correspond to keys in
-
WDmodel.fit.
hyper_param_guess
(spec, phot, model, pbs, params)[source]¶ Makes a guess for the parameter
mu
after the initial fit byquick_fit_spec_model()
Parameters: - spec (
numpy.recarray
) – The spectrum withdtype=[('wave', '<f8'), ('flux', '<f8'), ('flux_err', '<f8')]
- phot (
numpy.recarray
) – The photometry ofobjname
withdtype=[('pb', 'str'), ('mag', '<f8'), ('mag_err', '<f8')]
- model (
WDmodel.WDmodel.WDmodel
instance) – The DA White Dwarf SED model generator - pbs (dict) – Passband dictionary containing the passbands corresponding to
phot.pb` and generated by
WDmodel.passband.get_pbmodel()
. - params (dict) – A parameter dict such as that produced by
WDmodel.io.read_params()
Returns: out_params – The output parameter dictionary with an initial guess for
mu
Return type: Notes
Uses the initial guess of parameters from the spectrum fit by
quick_fit_spec_model()
to construct an initial guess of the SED, and computesmu
(which looks like a distance modulus, but also includes a normalization for the radius of the DA white dwarf, and it’s radius) as the median difference between the observed and synthetic photometry.- spec (
-
WDmodel.fit.
orig_cut_lines
(spec, model)[source]¶ Cut out the hydrogen Balmer spectral lines defined in
WDmodel.WDmodel.WDmodel
from the spectrum.The masking of Balmer lines is basic, and not very effective at high surface gravity or low temperature, or in the presence of non hydrogen lines. It’s used to get a roughly masked set of data suitable for continuum detection, and is effective in the context of our ground-based spectroscopic followup campaign for HST GO 12967 and 13711 programs.
Parameters: - spec (
numpy.recarray
) – The spectrum withdtype=[('wave', '<f8'), ('flux', '<f8'), ('flux_err', '<f8')]
- model (
WDmodel.WDmodel.WDmodel
instance) – The DA White Dwarf SED model generator
Returns: - linedata (
numpy.recarray
) – The observations of the spectrum corresponding to the hydrogen Balmer lines. Hasdtype=[('wave', '<f8'), ('flux', '<f8'), ('flux_err', '<f8'), ('line_mask', 'i4'), (line_ind', 'i4')]
- continuumdata (
numpy.recarray
) – The continuum data. Hasdtype=[('wave', '<f8'), ('flux', '<f8'), ('flux_err', '<f8')]
Notes
Does a coarse cut to remove hydrogen absorption lines from DA white dwarf spectra The line centroids, and widths are fixed and defined with the model grid This is insufficient, and particularly at high surface gravity and low temperatures the lines are blended. This routine is intended to provide a rough starting point for the process of continuum determination.
- spec (
-
WDmodel.fit.
polyfit_continuum
(continuumdata, wave)[source]¶ Fit a polynomial to the DA white dwarf continuum to normalize it - purely for visualization purposes
Parameters: - continuumdata (
numpy.recarray
) – The continuum data. Must havedtype=[('wave', '<f8'), ('flux', '<f8'), ('flux_err', '<f8')]
Produced by running the spectrum throughWDmodel.fit.orig_cut_lines()
and extracting the pre-defined lines in theWDmodel.WDmodel.WDmodel
instance. - wave (array-like) – The full spectrum wavelength array on which to interpolate the continuum model
Returns: cont_model – The continuum model Must have
dtype=[('wave', '<f8'), ('flux', '<f8')]
Return type: Notes
Roughly follows the algorithm described by the SDSS SSPP for a global continuum fit. Fits a red side and blue side at 5500 A separately to get a smooth polynomial representation. The red side uses a degree 5 polynomial and the blue side uses a degree 9 polynomial. Then “splices” them together - I don’t actually know how SDSS does this, but we simply assert the two bits are the same function - and fits the full continuum to a degree 9 polynomial.
- continuumdata (
-
WDmodel.fit.
pre_process_spectrum
(spec, bluelimit, redlimit, model, params, lamshift=0.0, vel=0.0, rebin=1, blotch=False, rescale=False)[source]¶ Pre-process the spectrum before fitting
Parameters: - spec (
numpy.recarray
) – The spectrum withdtype=[('wave', '<f8'), ('flux', '<f8'), ('flux_err', '<f8')]
- bluelimit (None or float) – Trim wavelengths bluer than this limit. Uses the bluest wavelength of spectrum if
None
- redlimit (None or float) – Trim wavelengths redder than this limit. Uses the reddest wavelength of spectrum if
None
- model (
WDmodel.WDmodel.WDmodel
instance) – The DA White Dwarf SED model generator - params (dict) – A parameter dict such as that produced by
WDmodel.io.read_params()
Will be modified to adjust the spectrum normalization parametersdl
limits ifrescale
is set - lamshift (float, optional) – Apply a flat wavelength shift to the spectrum. Useful if the target was
not properly centered in the slit, and the shift is not correlated with
wavelength. Default is
0
. - vel (float, optional) – Apply a velocity shift to the spectrum. Default is
0
. - rebin (int, optional) – Integer factor by which to rebin the spectrum.
Default is
1
(no rebinning). - blotch (bool, optional) – Attempt to remove cosmic rays and gaps from spectrum. Only to be used for quick look analysis at the telescope.
- rescale (bool, optional) – Rescale the spectrum to make the median noise
~1
. Has no effect on fitted parameters except spectrum flux normalization parameterdl
but makes residual plots, histograms more easily interpretable as they can be compared to anN(0, 1)
distribution.
Returns: spec – The spectrum with
dtype=[('wave', '<f8'), ('flux', '<f8'), ('flux_err', '<f8')]
Return type: - spec (
-
WDmodel.fit.
quick_fit_spec_model
(spec, model, params)[source]¶ Does a quick fit of the spectrum to get an initial guess of the fit parameters
Uses iminuit to do a rough diagonal fit - i.e. ignores covariance. For simplicity, also fixed FWHM and Rv (even when set to be fit). Therefore, only teff, logg, av, dl are fit for (at most). This isn’t robust, but it’s good enough for an initial guess.
Parameters: - spec (
numpy.recarray
) – The spectrum withdtype=[('wave', '<f8'), ('flux', '<f8'), ('flux_err', '<f8')]
- model (
WDmodel.WDmodel.WDmodel
instance) – The DA White Dwarf SED model generator - params (dict) – A parameter dict such as that produced by
WDmodel.io.read_params()
Returns: migrad_params – The output parameter dictionary with updated initial guesses stored in the
value
key. Same format asparams
.Return type: Raises: RuntimeError
– If all ofteff, logg, av, dl
are set as fixed - there’s nothing to fit.RuntimeWarning
– Ifminuit.Minuit.migrad()
orminuit.Minuit.hesse()
indicate that the fit is unreliable
Notes
None of the starting values for the parameters maybe
None
EXCEPTc
. This refines the starting guesses, and determines a reasonable value forc
- spec (
-
WDmodel.fit.
rebin_spec_by_int_factor
(spec, f=1)[source]¶ Rebins a spectrum by an integer factor f
Parameters: - spec (
numpy.recarray
) – The spectrum withdtype=[('wave', '<f8'), ('flux', '<f8'), ('flux_err', '<f8')]
- f (int, optional) – an integer factor to rebin the spectrum by. Default is
1
(no rebinning)
Returns: rspec – The rebinned spectrum with
dtype=[('wave', '<f8'), ('flux', '<f8'), ('flux_err', '<f8')]
Return type: Notes
If the spectrum is not divisible by f, the edges are trimmed by discarding the remainder measurements from both ends. If the remainder itself is odd, the extra measurement is discarded from the blue side.
- spec (