Bayesian Models

UniversalDiffEq.jl provides training algorithms for uncertainty quantification in NODEs and UDEs using Bayesian NODEs and UDEs. UniversalDiffEq.jl currently supports the classical No-U-Turn Sampling (NUTS) and the Stochastic Gradient Langevin Dynamics (SGLD) algorithms. These algorithms are available for the BayesianUDE constructor. This constructor can be created using the BayesianNODE and BayesianCustomDerivatives functions. These functions follow the same structure as their non-Bayesian versions NODE and CustomDerivatives.

UniversalDiffEq.BayesianNODEMethod
BayesianNODE(data;kwargs ... )

Constructs a Bayesian continuous-time model for the data set data using a single layer neural network to represent the system's dynamics.

kwargs

  • time_column_name: Name of column in data that corresponds to time. Default is "time".
  • hidden_units: Number of neurons in hidden layer. Default is 10.
  • seed: Fixed random seed for repeatable results. Default is 1.
  • proc_weight: Weight of process error $omega_{proc}$. Default is 1.0.
  • obs_weight: Weight of observation error $omega_{obs}$. Default is 1.0.
  • reg_weight: Weight of regularization error $omega_{reg}$. Default is 10^-6.
  • reg_type: Type of regularization, whether "L1" or "L2" regularization. Default is "L2".
  • l: Extrapolation parameter for forecasting. Default is 0.25.
  • extrap_rho: Extrapolation parameter for forecasting. Default is 0.0.
source
UniversalDiffEq.BayesianNODEMethod
BayesianNODE(data,X;kwargs ... )

When a data frame X is supplied the model will run with covariates. The argument X should have a column for time t with the value for time in the remaining columns. The values in X will be interpolated with a linear spline for values of time not included in the data frame.

kwargs

  • time_column_name: Name of column in data and X that corresponds to time. Default is "time".
  • variable_column_name: Name of column in X that corresponds to the variables. Default is nothing.
  • value_column_name: Name of column in X that corresponds to the covariates. Default is nothing.
  • hidden_units: Number of neurons in hidden layer. Default is 10.
  • seed: Fixed random seed for repeatable results. Default is 1.
  • proc_weight: Weight of process error $omega_{proc}$. Default is 1.0.
  • obs_weight: Weight of observation error $omega_{obs}$. Default is 1.0.
  • reg_weight: Weight of regularization error $omega_{reg}$. Default is 10^-6.
  • reg_type: Type of regularization, whether "L1" or "L2" regularization. Default is "L2".
  • l: Extrapolation parameter for forecasting. Default is 0.25.
  • extrap_rho: Extrapolation parameter for forecasting. Default is 0.0.
source
UniversalDiffEq.BayesianCustomDerivativesMethod
BayesianCustomDerivatives(data::DataFrame,derivs!::Function,initial_parameters;kwargs ... )

Constructs a Bayesian UDE model for the data set data based on user defined derivatives derivs. An initial guess of model parameters are supplied with the initial_parameters argument.

...

Arguments

  • data: a DataFrame object with the time of observations in a column labeled t and the remaining columns the value of the state variables at each time point.
  • derivs: a Function of the form derivs!(du,u,p,t) where u is the value of the state variables, p are the model parameters, t is time, and du is updated with the value of the derivatives
  • init_parameters: A NamedTuple with the model parameters. Neural network parameters must be listed under the key NN.

kwargs

  • time_column_name: Name of column in data that corresponds to time. Default is "time".
  • hidden_units: Number of neurons in hidden layer. Default is 10.
  • seed: Fixed random seed for repeatable results. Default is 1.
  • proc_weight: Weight of process error $omega_{proc}$. Default is 1.0.
  • obs_weight: Weight of observation error $omega_{obs}$. Default is 1.0.
  • reg_weight: Weight of regularization error $omega_{reg}$. Default is 10^-6.
  • reg_type: Type of regularization, whether "L1" or "L2" regularization. Default is "L2".
  • l: Extrapolation parameter for forecasting. Default is 0.25.
  • extrap_rho: Extrapolation parameter for forecasting. Default is 0.0.

...

source
UniversalDiffEq.BayesianCustomDerivativesMethod
BayesianCustomDerivatives(data::DataFrame,X,derivs!::Function,initial_parameters;kwargs ... )

When a data frame X is supplied the model will run with covariates. The argument X should have a column for time t with the value for time in the remaining columns. The values in X will be interpolated with a linear spline for value of time not included in the data frame.

When X is provided the derivs function must have the form derivs!(du,u,x,p,t) where x is a vector with the value of the covariates at time t.

kwargs

  • time_column_name: Name of column in data and X that corresponds to time. Default is "time".
  • variable_column_name: Name of column in X that corresponds to the variables. Default is nothing.
  • value_column_name: Name of column in X that corresponds to the covariates. Default is nothing.
  • hidden_units: Number of neurons in hidden layer. Default is 10.
  • seed: Fixed random seed for repeatable results. Default is 1.
  • proc_weight: Weight of process error $omega_{proc}$. Default is 1.0.
  • obs_weight: Weight of observation error $omega_{obs}$. Default is 1.0.
  • reg_weight: Weight of regularization error $omega_{reg}$. Default is 10^-6.
  • reg_type: Type of regularization, whether "L1" or "L2" regularization. Default is "L2".
  • l: Extrapolation parameter for forecasting. Default is 0.25.
  • extrap_rho: Extrapolation parameter for forecasting. Default is 0.0.
source

Training of Bayesian UDEs

Instead of training Bayesian UDEs using the gradient_descent! and BFGS! functions, we use the algorithms for Bayesian optimization. We repeat the tutorial for regular UDEs but create a BayesianUDE using BayesianCustomDerivatives instead of CustomDerivatives.

using UniversalDiffEq, DataFrames, Lux, Random

data, plt = LotkaVolterra()
dims_in = 2
hidden_units = 10
nonlinearity = tanh
dims_out = 1
NN = Lux.Chain(Lux.Dense(dims_in,hidden_units,nonlinearity),
        Lux.Dense(hidden_units,dims_out))

# initialize the neural network states and parameters 
rng = Random.default_rng() 
NNparameters, states = Lux.setup(rng,NN) 

function lotka_volterra_derivs!(du,u,p,t)
    C, states = NN(u,p.NN,states) 
    du[1] = p.r*u[1] - C[1]
    du[2] = p.theta*C[1] -p.m*u[2]
end

initial_parameters = (NN = NNparameters, r = 1.0, m = 0.5, theta = 0.5)
model = BayesianCustomDerivatives(data,lotka_volterra_derivs!,initial_parameters)

Training is then done using NUTS! or SGLD!:

NUTS!(model,samples = 500)
UniversalDiffEq.NUTS!Method
 NUTS!(UDE, kwargs ...)

performs Bayesian estimation on the parameters of an UDE using the NUTS sampling algorithm.

kwargs

  • delta: Step size used in NUTS adaptor. Default is 0.45.
  • samples: Number of parameters sampled. Default is 500.
  • burnin: Number of samples used as burnin of Bayesian algorithm. Default is samples/10.
  • verbose: Should the training loss values be printed?. Default is false.
source
UniversalDiffEq.SGLD!Method
 SGLD!(UDE, kwargs ...)

Performs Bayesian estimation on the parameters of an UDE using the SGLD sampling algorithm. At each step t, the stochastic update is provided by a random variable ε with mean 0 and variance:

math a(b + t-1)^γ

kwargs

  • a: Default is 10.0.
  • b: Default is 1000.
  • γ: Default is 0.9.
  • samples: Number of parameters sampled. Default is 500.
  • burnin: Number of samples used as burnin of Bayesian algorithm. Default is samples/10.
  • verbose: Should the training loss values be printed?. Default is false.
source

The other functions for model analysis have methods for the BayesianUDE constructor:

UniversalDiffEq.predictMethod
predict(UDE::BayesianUDE,test_data::DataFrame;summarize = true,ci = 95,df = true)

Uses the Bayesian UDE UDE to predict the state of the data test_data for each of the sampled parameters in training.

If summarize is true, this function returns the median prediction as well as the ci% lower and upper confidence intervals. Othwerise, it returns all the individual predictions for each sampled parameter.

If df is true, this function returns a DataFrame object. Otherwise, it returns an Array with the predictions.

source
UniversalDiffEq.plot_predictionsMethod
plot_predictions(UDE::BayesianUDE;ci=95)

Plots the correspondence between the observed state transitons and the predicitons for the model UDE with a confidence interval ci.

source
UniversalDiffEq.plot_forecastMethod
plot_forecast(UDE::BayesianUDE, T::Int)

Plots the models forecast up to T time steps into the future from the last observation including the median prediction as well as the ci% lower and upper confidence intervals.

source
UniversalDiffEq.plot_forecastMethod
plot_forecast(UDE::BayesianUDE, test_data::DataFrame)

Plots the model's forecast over the range of the test_data along with the value of the test data including the median prediction as well as the ci% lower and upper confidence intervals.

source