API¶
The core classes within juliet
are the load
and fit
classes. When creating a juliet.load
object, the returned object will be able to call a
fit
function which in turn returns a juliet.fit
object, which saves all the information about the fit (results statistics, posteriors, model evaluations,
etc.) — these classes are explained in detail below:
-
class
juliet.
load
(priors=None, starting_point=None, input_folder=None, t_lc=None, y_lc=None, yerr_lc=None, t_rv=None, y_rv=None, yerr_rv=None, GP_regressors_lc=None, linear_regressors_lc=None, GP_regressors_rv=None, linear_regressors_rv=None, out_folder=None, lcfilename=None, rvfilename=None, GPlceparamfile=None, GPrveparamfile=None, LMlceparamfile=None, LMrveparamfile=None, lctimedef='TDB', rvtimedef='UTC', ld_laws='quadratic', priorfile=None, lc_n_supersamp=None, lc_exptime_supersamp=None, lc_instrument_supersamp=None, mag_to_flux=True, verbose=False, matern_eps=0.01, pickle_encoding=None)¶ Given a dictionary with priors (or a filename pointing to a prior file) and data either given through arrays or through files containing the data, this class loads data into a juliet object which holds all the information about the dataset. Example usage:
>>> data = juliet.load(priors=priors,t_lc=times,y_lc=fluxes,yerr_lc=fluxes_errors)
Or, also,
>>> data = juliet.load(input_folder = folder)
Parameters: - priors –
(optional, dict or string) This can be either a python
string
or a pythondict
. If adict
, this has to contain each of the parameters to be fit, along with their respective prior distributions and hyperparameters. Each key of this dictionary has to have a parameter name (e.g.,r1_p1
,sigma_w_TESS
), and each of those elements are, in turn, dictionaries as well containing two keys: adistribution
key which defines the prior distribution of the parameter and ahyperparameters
key, which contains the hyperparameters of that distribution.- Example setup of the
priors
dictionary: >>> priors = {} >>> priors['r1_p1'] = {} >>> priors['r1_p1']['distribution'] = 'Uniform' >>> priors['r1_p1']['hyperparameters'] = [0.,1.]
If a
string
, this has to contain the filename to a proper juliet prior file; the priordict
will then be generated from there. A proper prior file has in the first column the name of the parameter, in the second the name of the distribution, and in the third the hyperparameters of that distribution for the parameter.Note that this along with either lightcurve or RV data or a
input_folder
has to be given in order to properly load a juliet data object. - Example setup of the
- starting_point – (mandatory if using MCMC, useless if using nested samplers, dict)
Dictionary indicating the starting value of each of the parameters for the MCMC run (i.e., currently only of use for
emcee
). Keys should be consistent with theprior
namings above; each key should have an associated float with the starting value. This is of no use if using nested samplers (which sample directly from the prior). - input_folder –
(optional, string) Python
string
containing the path to a folder containing all the input data — this will thus be load into a juliet data object. This input folder has to contain at least apriors.dat
file with the priors and either alc.dat
file containing lightcurve data or arvs.dat
file containing radial-velocity data. If in this folder aGP_lc_regressors.dat
file or aGP_rv_regressors.dat
file is found, data will be loaded into the juliet object as well.Note that at least this or a
priors
string or dictionary, along with either lightcurve or RV data has to be given in order to properly load a juliet data object. - t_lc –
(optional, dictionary) Dictionary whose keys are instrument names; each of those keys is expected to have arrays with the times corresponding to those instruments. For example,
>>> t_lc = {} >>> t_lc['TESS'] = np.linspace(0,100,100)
Is a valid input dictionary for
t_lc
. - y_lc – (optional, dictionary)
Similarly to
t_lc
, dictionary whose keys are instrument names; each of those keys is expected to have arrays with the fluxes corresponding to those instruments. These are expected to be consistent with thet_lc
dictionaries. - yerr_lc – (optional, dictionary)
Similarly to
t_lc
, dictionary whose keys are instrument names; each of those keys is expected to have arrays with the errors on the fluxes corresponding to those instruments. These are expected to be consistent with thet_lc
dictionaries. - GP_regressors_lc –
(optional, dictionary) Dictionary whose keys are names of instruments where a GP is to be fit. On each name/element, an array of regressors of shape
(m,n)
containing in each column then
GP regressors to be used form
photometric measurements has to be given. Note thatm
for a given instrument has to be of the same length as the correspondingt_lc
for that instrument. Also, note the order of each regressor of each instrument has to match the corresponding order in thet_lc
array. For example,>>> GP_regressors_lc = {} >>> GP_regressors_lc['TESS'] = np.linspace(-1,1,100)
If a global model wants to be used, then the instrument should be
rv
, and each of them
rows should correspond to them
times. - linear_regressors_lc – (optional, dictionary)
Similarly as for
GP_regressors_lc
, this is a dictionary whose keys are names of instruments where a linear regression is to be fit. On each name/element, an array of shape(q,p)
containing in each column thep
linear regressors to be used for theq
photometric measurements. Again, note the order of each regressor of each instrument has to match the corresponding order in thet_lc
array. - GP_regressors_rv – (optional, dictionary)
Same as
GP_regressors_lc
but for the radial-velocity data. If a global model wants to be used, then the instrument should belc
, and each of them
rows should correspond to them
times. - linear_regressors_rv – (optional, dictionary)
Same as
linear_regressors_lc
, but for the radial-velocities. - t_rv – (optional, dictionary)
Same as
t_lc
, but for the radial-velocities. - y_rv – (optional, dictionary)
Same as
y_lc
, but for the radial-velocities. - yerr_rv – (optional, dictionary)
Same as
yerr_lc
, but for the radial-velocities. - out_folder – (optional, string)
If a path is given, results will be saved to that path as a
pickle
file, along with all inputs in the standard juliet format. - lcfilename – (optional, string)
If a path to a lightcurve file is given,
t_lc
,y_lc
,yerr_lc
andinstruments_lc
will be read from there. The basic file format is a pure ascii file where times are in the first column, relative fluxes in the second, errors in the third and instrument names in the fourth. If more columns are given for a given instrument, those will be identified as linear regressors for those instruments. - rvfilename – (optional, string)
Same as
lcfilename
, but for the radial-velocities. - GPlceparamfile – (optional, string)
If a path to a file is given, the columns of that file will be used as GP regressors for the lightcurve fit. The file format is a pure ascii file
where regressors are given in different columns, and the last column holds the instrument name. The order of this file has to be consistent with
t_lc
and/or thelcfilename
file. If a global model wants to be used, set the instrument names of all regressors tolc
. - GPrveparamfile – (optional, string)
Same as
GPlceparamfile
but for the radial-velocities. If a global model wants to be used, set the instrument names of all regressors torv
. - LMlceparamfile – (optional, string)
If a path to a file is given, the columns of that file will be used as linear regressors for the lightcurve fit. The file format is a pure ascii file
where regressors are given in different columns, and the last column holds the instrument name. The order of this file has to be consistent with
t_lc
and/or thelcfilename
file. If a global model wants to be used, set the instrument names of all regressors tolc
. - LMrveparamfile – (optional, string)
Same as
LMlceparamfile
but for the radial-velocities. If a global model wants to be used, set the instrument names of all regressors torv
. - lctimedef – (optional, string)
Time definitions for each of the lightcurve instruments. Default is to assume all instruments (in lcs and rvs) have the same time definitions. If more than one instrument is given, this string
should have instruments and time-definitions separated by commas, e.g.,
TESS-TDB, LCOGT-UTC
, etc. - rvtimedef – (optional, string)
Time definitions for each of the radial-velocity instruments. Default is to assume all instruments (in lcs and rvs) have the same time definitions. If more than one instrument is given,
this string should have instruments and time-definitions separated by commas, e.g.,
FEROS-TDB, HARPS-UTC
, etc. - ld_laws – (optional, string)
Limb-darkening law to be used for each instrument. Default is
quadratic
for all instruments. If more than one instrument is given, this string should have instruments and limb-darkening laws separated by commas, e.g.,TESS-quadratic, LCOGT-linear
. - priorfile – (optional, string)
If a path to a file is given, it will be assumed this is a prior file. The
priors
dictionary will be overwritten by the data in this file. The file structure is a plain ascii file, with the name of the parameters in the first column, name of the prior distribution in the second column and hyperparameters in the third column. - lc_instrument_supersamp – (optional, array of strings)
Define for which lightcurve instruments super-sampling will be applied (e.g., in the case of long-cadence integrations). e.g.,
lc_instrument_supersamp = ['TESS','K2']
- lc_n_supersamp – (optional, array of ints)
Define the number of datapoints to supersample. Order should be consistent with order in
lc_instrument_supersamp
. e.g.,lc_n_supersamp = [20,30]
. - lc_exptime_supersamp – (optional, array of floats)
Define the exposure-time of the observations for the supersampling. Order should be consistent with order in
lc_instrument_supersamp
. e.g.,lc_exptime_supersamp = [0.020434,0.020434]
- verbose – (optional, boolean) If True, all outputs of the code are printed to terminal. Default is False.
- matern_eps – (optional, float) Epsilon parameter for the Matern approximation (see celerite documentation).
- pickle_encoding – (optional, string) Define pickle encoding in case fit was done with Python 2.7 and results are read with Python 3.
-
append_GP
(ndata, instrument_indexes, GP_arguments, inames)¶ This function appends all the GP regressors into one — useful for the global models.
-
convert_input_data
(t, y, yerr)¶ This converts the input dictionaries to arrays (this is easier to handle internally within juliet; input dictionaries are just asked because it is easier for the user to pass them).
-
convert_to_dictionary
(t, y, yerr, instrument_indexes)¶ Convert data given in arrays to dictionaries for easier user usage
-
data_preparation
(times, instruments, linear_regressors)¶ This function generates f useful internal arrays for this class: inames which saves the instrument names,
global_times
which is a “flattened” array of thetimes
dictionary where all the times for all instruments are stacked, instrument_indexes, which is a dictionary that has, for each instrument the indexes of theglobal_times
corresponding to each instrument, lm_boolean which saves booleans for each instrument to indicate if there are linear regressors and lm_arguments which are the linear-regressors for each instrument.
-
fit
(**kwargs)¶ Perhaps the most important function of the juliet data object. This function fits your data using the nested sampler of choice. This returns a results object which contains all the posteriors information.
-
generate_datadict
(dictype)¶ This generates the options dictionary for lightcurves, RVs, and everything else you want to fit. Useful for the fit, as it separaters options per instrument.
Parameters: dictype – (string) Defines the type of dictionary type. It can either be ‘lc’ (for the lightcurve dictionary) or ‘rv’ (for the radial-velocity one).
-
save_data
(fname, t, y, yerr, instruments, lm_boolean, lm_arguments)¶ This function saves t,y,yerr,instruments,lm_boolean and lm_arguments data to fname.
-
save_priorfile
(fname)¶ This function saves a priorfile file out to fname
-
save_regressors
(fname, GP_arguments)¶ This function saves the GP regressors to fname.
- priors –
-
class
juliet.
fit
(data, sampler='multinest', n_live_points=500, nwalkers=100, nsteps=300, nburnin=500, emcee_factor=0.0001, ecclim=1.0, pl=0.0, pu=1.0, ta=2458460.0, nthreads=None, use_ultranest=False, use_dynesty=False, dynamic=False, dynesty_bound='multi', dynesty_sample='rwalk', dynesty_nthreads=None, dynesty_n_effective=inf, dynesty_use_stop=True, dynesty_use_pool=None, **kwargs)¶ Given a juliet data object, this class performs a fit to the data and returns a results object to explore the results. Example usage:
>>> results = juliet.fit(data)
Parameters: data – (juliet object) An object containing all the information regarding the data to be fitted, including options of the fit. Generated via juliet.load(). On top of
data
, a series of extra keywords can be included:Parameters: - sampler – (optional, string)
String defining the sampler to be used on the fit. Current possible options include
multinest
to use PyMultiNest (via importance nested sampling),dynesty
to use Dynesty’s importance nested sampling,dynamic_dynesty
to use Dynesty’s dynamic nested sampling algorithm,ultranest
to use Ultranest,slicesampler_ultranest
to use Ultranest’s slice sampler andemcee
to use emcee. Default ismultinest
if PyMultiNest is installed;dynesty
if not. - n_live_points – (optional, int) Number of live-points to use on the nested sampling samplers. Default is 500.
- nwalkers – (optional if using emcee, int) Number of walkers to use by emcee. Default is 100.
- nsteps – (optional if using MCMC, int) Number of steps/jumps to perform on the MCMC run. Default is 300.
- nburnin – (optional if using MCMC, int) Number of burnin steps/jumps when performing the MCMC run. Default is 500.
- emcee_factor – (optional, for emcee only, float) Factor multiplying the standard-gaussian ball around which the initial position is perturbed for each walker. Default is 1e-4.
- ecclim – (optional, float)
Upper limit on the maximum eccentricity to sample. Default is
1
. - pl – (optional, float)
If the
(r1,r2)
parametrization for(b,p)
is used, this defines the lower limit of the planet-to-star radius ratio to be sampled. Default is0
. - pu – (optional, float)
Same as
pl
, but for the upper limit. Default is1
. - ta – (optional, float) Time to be substracted to the input times in order to generate the linear and/or quadratic trend to be added to the model. Default is 2458460.
- nthreads – (optinal, int) Define the number of threads to use within dynesty or emcee. Default is to use just 1. Note this will not impact PyMultiNest or UltraNest runs — these can be parallelized via MPI only.
In addition, any number of extra optional keywords can be given to the call, which will be directly ingested into the sampler of choice. For a full list of optional keywords for…
- …PyMultiNest, check the docstring of
PyMultiNest
’srun
function. - …any of the nested sampling algorithms in
dynesty
, see the docstring on therun_nested
function. - …the non-dynamic nested sampling algorithm implemented in
dynesty
, see the docstring ondynesty.dynesty.NestedSampler
in dynesty’s documentation. - …the dynamic nested sampling in
dynesty
check the docstring fordynesty.dynesty.DynamicNestedSampler
in dynesty’s documentation. - …the
ultranest
sampler, see the docstring for ultranest.integrationr.ReactiveNestedSampler in ultranest’s documentation
Finally, since
juliet
version 2.0.26, the following keywords have been deprecated, and are recommended to be removed from code usingjuliet
as they will be removed sometime in the future:Parameters: - use_dynesty – (optional, boolean)
If
True
, use dynesty instead of MultiNest for posterior sampling and evidence evaluation. Default isFalse
, unless MultiNest viapymultinest
is not working on the system. - dynamic – (optional, boolean)
If
True
, use dynamic Nested Sampling with dynesty. Default isFalse
. - dynesty_bound – (optional, string)
Define the dynesty bound method to use (currently either
single
ormulti
, to use either single ellipsoids or multiple ellipsoids). Default ismulti
(for details, see the dynesty API). - dynesty_sample –
(optional, string) Define the sampling method for dynesty to use. Default is
rwalk
. Accorfing to the dynesty API, this should be changed depending on the number of parameters being fitted. If smaller than about 20,rwalk
is optimal. For larger dimensions,slice
orrslice
should be used. - dynesty_nthreads – (optional, int) Define the number of threads to use within dynesty. Default is to use just 1.
- dynesty_n_effective – (optional, int)
Minimum number of effective posterior samples when using
dynesty
. If the estimated “effective sample size” exceeds this number, sampling will terminate. Default isNone
. - dynesty_use_stop – (optional, boolean)
Whether to evaluate the
dynesty
stopping function after each batch. Disabling this can improve performance if other stopping criteria such as maxcall are already specified. Default isTrue
. - dynesty_use_pool – (optional, dict)
A dictionary containing flags indicating where a pool in
dynesty
should be used to execute operations in parallel. These govern whetherprior_transform
is executed in parallel during initialization ('prior_transform'
), loglikelihood is executed in parallel during initialization ('loglikelihood'
), live points are proposed in parallel during a run ('propose_point'
), and bounding distributions are updated in parallel during a run ('update_bound'
). Default is True for all options.
- sampler – (optional, string)
String defining the sampler to be used on the fit. Current possible options include
The returned fit
object, in turn, also has other objects inherted in it. In particular, if results
is a juliet.fit
object, results.lc
and results.rv
are juliet.model
objects that host all the details about the dataset being modelled. This follows the model definition outlined in Section 2 of the
juliet paper:
-
class
juliet.
model
(data, modeltype, pl=0.0, pu=1.0, ecclim=1.0, ta=2458460.0, log_like_calc=False)¶ Given a juliet data object, this kernel generates either a lightcurve or a radial-velocity object. Example usage:
>>> model = juliet.model(data, modeltype = 'lc')
Parameters: - data – (juliet.load object) An object containing all the information about the current dataset.
- modeltype – (optional, string) String indicating whether the model to generate should be a lightcurve (‘lc’) or a radial-velocity (‘rv’) model.
- pl – (optional, float)
If the
(r1,r2)
parametrization for(b,p)
is used, this defines the lower limit of the planet-to-star radius ratio to be sampled. Default is0
. - pu – (optional, float)
Same as
pl
, but for the upper limit. Default is1
. - ecclim – (optional, float)
This parameter sets the maximum eccentricity allowed such that a model is actually evaluated. Default is
1
. - log_like_calc – (optional, boolean)
If True, it is assumed the model is generated to generate likelihoods values, and thus this skips the saving/calculation of the individual
models per planet (i.e.,
self.model['p1']
,self.model['p2']
, etc. will not exist). Default is False.
-
evaluate_model
(instrument=None, parameter_values=None, resampling=None, nresampling=None, etresampling=None, all_samples=False, nsamples=1000, return_samples=False, t=None, GPregressors=None, LMregressors=None, return_err=False, alpha=0.68, return_components=False, evaluate_transit=False)¶ This function evaluates the current lc or rv model given a set of posterior distribution samples and/or parameter values. Example usage:
>>> dataset = juliet.load(priors=priors, t_lc = times, y_lc = fluxes, yerr_lc = fluxes_error) >>> results = dataset.fit() >>> transit_model, error68_up, error68_down = results.lc.evaluate('TESS', return_err=True)
Or:
>>> dataset = juliet.load(priors=priors, t_rv = times, y_rv = fluxes, yerr_rv = fluxes_error) >>> results = dataset.fit() >>> rv_model, error68_up, error68_down = results.rv.evaluate('FEROS', return_err=True)
Parameters: instrument – (optional, string) Instrument the user wants to evaluate the model on. It is expected to be given for non-global models, not necessary for global models.
Parameters: parameter_values – (optional, dict) Dictionary containing samples of the posterior distribution or, more generally, parameter valuesin it. Each key is a parameter name (e.g. ‘p_p1’, ‘q1_TESS’, etc.), and inside each of those keys an array of N samples is expected (i.e., parameter_values[‘p_p1’] is an array of length N). The indexes have to be consistent between different parameters.
Parameters: resampling – (optional, boolean) Boolean indicating if the model needs to be resampled or not. Only works for lightcurves.
Parameters: nresampling – (optional, int) Number of points to resample for a given time-stamp. Only used if resampling = True. Only applicable to lightcurves.
Parameters: etresampling – (optional, double) Exposure time of the resampling (same unit as times). Only used if resampling = True. Only applicable to lightcurves.
Parameters: all_samples – (optional, boolean) If True, all posterior samples will be used to evaluate the model. Default is False.
Parameters: nsamples – (optional, int) Number of posterior samples to be used to evaluate the model. Default is 1000 (note each call to this function will sample nsamples different samples from the posterior, so no two calls are exactly the same).
Parameters: return_samples – (optional, boolean) Boolean indicating whether the user wants the posterior model samples (i.e., the models evaluated in each of the posterior sample draws) to be returned. Default is False.
Parameters: t – (optional, numpy array) Array with the times at which the model wants to be evaluated.
Parameters: GPRegressors – (optional, numpy array) Array containing the GP Regressors onto which to evaluate the models. Dimensions must be consistent with input t. If model is global, this needs to be a dictionary.
Parameters: LMRegressors – (optional, numpy array or dictionary) If the model is not global, this is an array containing the Linear Regressors onto which to evaluate the model for the input instrument. Dimensions must be consistent with input t. If model is global, this needs to be a dictionary.
Parameters: return_err – (optional, boolean) If True, this returns the credibility interval on the evaluated model. Default credibility interval is 68%.
Parameters: alpha – (optional, double) Credibility interval for return_err. Default is 0.68, i.e., the 68% credibility interval.
Parameters: return_components – (optional, boolean) If True, each component of the model is returned (i.e., the Gaussian Process component, the Linear Model component, etc.).
Parameters: evaluate_transit – (optional, boolean) If True, the function evaluates only the transit model and not the Gaussian Process or Linear Model components.
Returns: By default, the function returns the median model as evaluated with the posterior samples. Depending on the options chosen by the user, this can return up to 5 elements (in that order): model_samples, median_model, upper_CI, lower_CI and components. The first is an array with all the model samples as evaluated from the posterior. The second is the median model. The third and fourth are the uppper and lower Credibility Intervals, and the latter is a dictionary with the model components.
Finally, the juliet.load
object also contains a dictionary (juliet.load.lc_options
for lightcurves and juliet.load.rv_options
for radial-velocities)
which holds, if a gaussian-process is being used to model the noise, a juliet.gaussian_process
object. This class handles everything related to the gaussian-processes,
from model and parameter names/values, to log-likelihood evaluations. This class is defined below:
-
class
juliet.
gaussian_process
(data, model_type, instrument, george_hodlr=True, matern_eps=0.01)¶ Given a juliet data object (created via juliet.load), a model type (i.e., is this a GP for a RV or lightcurve dataset) and an instrument name, this object generates a Gaussian Process (GP) object to use within the juliet library. Example usage:
>>> GPmodel = juliet.gaussian_process(data, model_type = 'lc', instrument = 'TESS')
- :param data (juliet.load object)
- Object containing all the information about the current dataset. This will help in determining the type of kernel the input instrument has and also if the instrument has any errors associated with it to initialize the kernel.
Parameters: - model_type – (string)
A string defining the type of data the GP will be modelling. Can be either
lc
(for photometry) orrv
(for radial-velocities). - instrument – (string)
A string indicating the name of the instrument the GP is being applied to. This string simplifies cross-talk with juliet’s
posteriors
dictionary. - george_hodlr – (optional, boolean) If True, this uses George’s HODLR solver (faster).