Bayesian Models
UniversalDiffEq.jl provides training algorithms for uncertainty quantification in NODEs and UDEs using Bayesian NODEs and UDEs. UniversalDiffEq.jl currently supports the classical No-U-Turn Sampling (NUTS) and the Stochastic Gradient Langevin Dynamics (SGLD) algorithms. These algorithms are available for the BayesianUDE
constructor. This constructor can be created using the BayesianNODE
and BayesianCustomDerivatives
functions. These functions follow the same structure as their non-Bayesian versions NODE
and CustomDerivatives
.
UniversalDiffEq.BayesianNODE
— MethodBayesianNODE(data;kwargs ... )
Constructs a Bayesian continuous-time model for the data set data
using a single layer neural network to represent the system's dynamics.
kwargs
time_column_name
: Name of column indata
that corresponds to time. Default is"time"
.hidden_units
: Number of neurons in hidden layer. Default is10
.seed
: Fixed random seed for repeatable results. Default is1
.proc_weight
: Weight of process error $omega_{proc}$. Default is1.0
.obs_weight
: Weight of observation error $omega_{obs}$. Default is1.0
.reg_weight
: Weight of regularization error $omega_{reg}$. Default is10^-6
.reg_type
: Type of regularization, whether"L1"
or"L2"
regularization. Default is"L2"
.l
: Extrapolation parameter for forecasting. Default is0.25
.extrap_rho
: Extrapolation parameter for forecasting. Default is0.0
.
UniversalDiffEq.BayesianNODE
— MethodBayesianNODE(data,X;kwargs ... )
When a data frame X
is supplied the model will run with covariates. The argument X
should have a column for time t
with the value for time in the remaining columns. The values in X
will be interpolated with a linear spline for values of time not included in the data frame.
kwargs
time_column_name
: Name of column indata
andX
that corresponds to time. Default is"time"
.variable_column_name
: Name of column inX
that corresponds to the variables. Default isnothing
.value_column_name
: Name of column inX
that corresponds to the covariates. Default isnothing
.hidden_units
: Number of neurons in hidden layer. Default is10
.seed
: Fixed random seed for repeatable results. Default is1
.proc_weight
: Weight of process error $omega_{proc}$. Default is1.0
.obs_weight
: Weight of observation error $omega_{obs}$. Default is1.0
.reg_weight
: Weight of regularization error $omega_{reg}$. Default is10^-6
.reg_type
: Type of regularization, whether"L1"
or"L2"
regularization. Default is"L2"
.l
: Extrapolation parameter for forecasting. Default is0.25
.extrap_rho
: Extrapolation parameter for forecasting. Default is0.0
.
UniversalDiffEq.BayesianCustomDerivatives
— MethodBayesianCustomDerivatives(data::DataFrame,derivs!::Function,initial_parameters;kwargs ... )
Constructs a Bayesian UDE model for the data set data
based on user defined derivatives derivs
. An initial guess of model parameters are supplied with the initial_parameters
argument.
...
Arguments
- data: a DataFrame object with the time of observations in a column labeled
t
and the remaining columns the value of the state variables at each time point. - derivs: a Function of the form
derivs!(du,u,p,t)
whereu
is the value of the state variables,p
are the model parameters,t
is time, and du is updated with the value of the derivatives - init_parameters: A
NamedTuple
with the model parameters. Neural network parameters must be listed under the keyNN
.
kwargs
time_column_name
: Name of column indata
that corresponds to time. Default is"time"
.hidden_units
: Number of neurons in hidden layer. Default is10
.seed
: Fixed random seed for repeatable results. Default is1
.proc_weight
: Weight of process error $omega_{proc}$. Default is1.0
.obs_weight
: Weight of observation error $omega_{obs}$. Default is1.0
.reg_weight
: Weight of regularization error $omega_{reg}$. Default is10^-6
.reg_type
: Type of regularization, whether"L1"
or"L2"
regularization. Default is"L2"
.l
: Extrapolation parameter for forecasting. Default is0.25
.extrap_rho
: Extrapolation parameter for forecasting. Default is0.0
.
...
UniversalDiffEq.BayesianCustomDerivatives
— MethodBayesianCustomDerivatives(data::DataFrame,X,derivs!::Function,initial_parameters;kwargs ... )
When a data frame X
is supplied the model will run with covariates. The argument X
should have a column for time t
with the value for time in the remaining columns. The values in X
will be interpolated with a linear spline for value of time not included in the data frame.
When X
is provided the derivs function must have the form derivs!(du,u,x,p,t)
where x
is a vector with the value of the covariates at time t
.
kwargs
time_column_name
: Name of column indata
andX
that corresponds to time. Default is"time"
.variable_column_name
: Name of column inX
that corresponds to the variables. Default isnothing
.value_column_name
: Name of column inX
that corresponds to the covariates. Default isnothing
.hidden_units
: Number of neurons in hidden layer. Default is10
.seed
: Fixed random seed for repeatable results. Default is1
.proc_weight
: Weight of process error $omega_{proc}$. Default is1.0
.obs_weight
: Weight of observation error $omega_{obs}$. Default is1.0
.reg_weight
: Weight of regularization error $omega_{reg}$. Default is10^-6
.reg_type
: Type of regularization, whether"L1"
or"L2"
regularization. Default is"L2"
.l
: Extrapolation parameter for forecasting. Default is0.25
.extrap_rho
: Extrapolation parameter for forecasting. Default is0.0
.
Training of Bayesian UDEs
Instead of training Bayesian UDEs using the gradient_descent!
and BFGS!
functions, we use the algorithms for Bayesian optimization. We repeat the tutorial for regular UDEs but create a BayesianUDE
using BayesianCustomDerivatives
instead of CustomDerivatives
.
using UniversalDiffEq, DataFrames, Lux, Random
data, plt = LotkaVolterra()
dims_in = 2
hidden_units = 10
nonlinearity = tanh
dims_out = 1
NN = Lux.Chain(Lux.Dense(dims_in,hidden_units,nonlinearity),
Lux.Dense(hidden_units,dims_out))
# initialize the neural network states and parameters
rng = Random.default_rng()
NNparameters, states = Lux.setup(rng,NN)
function lotka_volterra_derivs!(du,u,p,t)
C, states = NN(u,p.NN,states)
du[1] = p.r*u[1] - C[1]
du[2] = p.theta*C[1] -p.m*u[2]
end
initial_parameters = (NN = NNparameters, r = 1.0, m = 0.5, theta = 0.5)
model = BayesianCustomDerivatives(data,lotka_volterra_derivs!,initial_parameters)
Training is then done using NUTS!
or SGLD!
:
NUTS!(model,samples = 500)
UniversalDiffEq.NUTS!
— Method NUTS!(UDE, kwargs ...)
performs Bayesian estimation on the parameters of an UDE using the NUTS sampling algorithm.
kwargs
delta
: Step size used in NUTS adaptor. Default is0.45
.samples
: Number of parameters sampled. Default is500
.burnin
: Number of samples used as burnin of Bayesian algorithm. Default issamples/10
.verbose
: Should the training loss values be printed?. Default isfalse
.
UniversalDiffEq.SGLD!
— Method SGLD!(UDE, kwargs ...)
Performs Bayesian estimation on the parameters of an UDE using the SGLD sampling algorithm. At each step t
, the stochastic update is provided by a random variable ε with mean 0 and variance:
math a(b + t-1)^γ
kwargs
a
: Default is10.0
.b
: Default is1000
.γ
: Default is 0.9.samples
: Number of parameters sampled. Default is500
.burnin
: Number of samples used as burnin of Bayesian algorithm. Default issamples/10
.verbose
: Should the training loss values be printed?. Default isfalse
.
The other functions for model analysis have methods for the BayesianUDE
constructor:
UniversalDiffEq.predict
— Methodpredict(UDE::BayesianUDE,test_data::DataFrame;summarize = true,ci = 95,df = true)
Uses the Bayesian UDE UDE
to predict the state of the data test_data
for each of the sampled parameters in training.
If summarize
is true
, this function returns the median prediction as well as the ci
% lower and upper confidence intervals. Othwerise, it returns all the individual predictions for each sampled parameter.
If df
is true, this function returns a DataFrame
object. Otherwise, it returns an Array
with the predictions.
UniversalDiffEq.plot_predictions
— Methodplot_predictions(UDE::BayesianUDE;ci=95)
Plots the correspondence between the observed state transitons and the predicitons for the model UDE
with a confidence interval ci
.
UniversalDiffEq.plot_forecast
— Methodplot_forecast(UDE::BayesianUDE, T::Int)
Plots the models forecast up to T time steps into the future from the last observation including the median prediction as well as the ci
% lower and upper confidence intervals.
UniversalDiffEq.plot_forecast
— Methodplot_forecast(UDE::BayesianUDE, test_data::DataFrame)
Plots the model's forecast over the range of the test_data
along with the value of the test data including the median prediction as well as the ci
% lower and upper confidence intervals.