Model Constructors

UniversalDiffEq.jl provides a set of functions to construct universal differential equations (UDEs) and neural ordinary differential equations (NODEs) with varying levels of customization. The model constructors all require the data to be passed using a DataFrame object from the DataFrames.jl library. The data frame should be organized with one column for time and one column for each dimension of the observations. The name of the column for time is passed using a keyword argument time_column_name that has a default value "time".

Table: Example dataset with two state variables

`time`	$y_1$	$y_2$
0.1	0.0	-1.1
0.2	0.01	-0.9
0.5	0.51	-1.05

Currently, missing data are not directly supported, but irregular intervals between time points are allowed.

Each constructor function requires additional inputs to specify the model structure. For example, the CustomDerivatives function requires the user to supply the known functional forms through the derivs! argument. The subsection for each model type describes these arguments in detail.

Finally, the constructor functions share a set of keyword arguments (kwargs) used to tune the model fitting procedure. These control the weights given to the process model, observation model, and regularization in the loss function. Larger values of the regularization weight limit the complexity of the relationships learned by the neural network to reduce the likelihood of overfitting. The observation weight controls how closely the estimated states $u_t$ match the observations $y_t$; smaller values of the observation weight correspond to datasets with larger amounts of observation error and vice versa.

proc_weight=1.0 : The weight given to the model predictions in the loss function
obs_weight=1.0 : The weight given to the state estimates in the loss function
reg_weight=0.0 : The weight given to regularization in the loss function

In addition to these weighting parameters, the keyword argument l controls how far the model will extrapolate beyond the observed data before reverting to a default value extrap_rho when forecasting.

NODEs (i.e., nonparametric universal dynamic equations)

UniversalDiffEq.jl has two functions to build time series models that use a neural network to learn the dynamics of a time series. The function NODE builds a continuous-time UDE with a neural network representing the right-hand side of the differential equation

\[ \frac{du}{dt} = NN(u;w,b),\]

The function NNDE (neural network difference equation) constructs a discrete-time model with a neural network on the right-hand side

\[ x_{t+1} = x_t + NN(x_t).\]

UniversalDiffEq.NODE — Method

NODE(data;kwargs ... )

Constructs a nonparametric continuous-time model for the data set data using a single layer neural network to represent the system's dynamics.

# kwargs

time_column_name: Name of column in data that corresponds to time. Default is "time".
hidden_units: Number of neurons in hidden layer. Default is 10.
seed: Fixed random seed for repeatable results. Default is 1.
proc_weight: Weight of process error $omega_{proc}$. Default is 1.0.
obs_weight: Weight of observation error $omega_{obs}$. Default is 1.0.
reg_weight: Weight of regularization error $omega_{reg}$. Default is 10^-6.
reg_type: Type of regularization, whether "L1" or "L2" regularization. Default is "L2".
l: Extrapolation parameter for forecasting. Default is 0.25.
extrap_rho: Extrapolation parameter for forecasting. Default is 0.0.
ode_solver: method to aproximate solutions to the differntail equation. Defaul is Tsit5().
ad_method:method to evalaute derivatives of the ODE solver. Default is ForwardDiffSensitivity().

source

UniversalDiffEq.NNDE — Method

NNDE(data;kwargs ...)

Constructs a nonparametric discrete-time model for the data set data using a single layer neural network to represent the system's dynamics.

# kwargs

time_column_name: Name of column in data that corresponds to time. Default is "time".
hidden_units: Number of neurons in hidden layer. Default is 10.
seed: Fixed random seed for repeatable results. Default is 1.
proc_weight: Weight of process error $omega_{proc}$. Default is 1.0.
obs_weight: Weight of observation error $omega_{obs}$. Default is 1.0.
reg_weight: Weight of regularization error $omega_{reg}$. Default is 10^-6.
reg_type: Type of regularization, whether "L1" or "L2" regularization. Default is "L2".
l: Extrapolation parameter for forecasting. Default is 0.25.
extrap_rho: Extrapolation parameter for forecasting. Default is 0.0.

source

Covariates can be included in the model by supplying a second data frame, X. This data frame must have the same column name for time as the primary dataset, but the time points do not need to match because the values of the covariates between time points included in the data frame X are interpolated using a linear spline. The NODE and NNDE functions will append the value of the covariates at each point in time to the neural network inputs

\[ \frac{dx}{dt} = NN(x,X(t);w,b) \\ x_{t+1} = x_t + NN(x_t, X(t)).\]

UniversalDiffEq.NODE — Method

NODE(data,X;kwargs ... )

When a data frame X is supplied the model will run with covariates. The argument X should have a column for time t with the value for time in the remaining columns. The values in X will be interpolated with a linear spline for values of time not included in the data frame.

time_column_name: Name of column in data and X that corresponds to time. Default is "time".
variable_column_name: Name of column in X that corresponds to the variables. Default is nothing.
value_column_name: Name of column in X that corresponds to the covariates. Default is nothing.
hidden_units: Number of neurons in hidden layer. Default is 10.
seed: Fixed random seed for repeatable results. Default is 1.
proc_weight: Weight of process error $omega_{proc}$. Default is 1.0.
obs_weight: Weight of observation error $omega_{obs}$. Default is 1.0.
reg_weight: Weight of regularization error $omega_{reg}$. Default is 10^-6.
reg_type: Type of regularization, whether "L1" or "L2" regularization. Default is "L2".
l: Extrapolation parameter for forecasting. Default is 0.25.
extrap_rho: Extrapolation parameter for forecasting. Default is 0.0.
ode_solver: method to aproximate solutions to the differntail equation. Defaul is Tsit5().
ad_method:method to evalaute derivatives of the ODE solver. Default is ForwardDiffSensitivity().

source

Long-format datasets can be used to define models with multiple covariates that have different sampling frequencies. If a long-format dataset is provided, the user must specify which column contains the variable names and which column contains the values of the variables using the variable_column_name and value_column_name keywords. In the example below, the variable column name is "variable", and the value column name is "value".

Table: Example covariate data in long format

`time`	`variable`	`value`
1.0	X_1	-1.1
2.0	X_1	-0.9
3.0	X_1	-1.05
...	...	...
1.0	X_2	3.0
2.0	X_2	0.95
4.0	X_2	-1.25

Customizing universal dynamic equations

The CustomDerivatives and CustomDifference functions can be used to build models that combine neural networks and known functional forms. These functions take user-defined models, construct a loss function, and access the model fitting and testing functions provided by UniversalDiffEq.jl.

Continuous-time models

The CustomDerivatives function builds SS-UDE models based on a user-defined function derivs!(du,u,p,t), which updates the vector du with the right-hand side of a differential equation evaluated at time t in state u given parameters p. The function also needs an initial guess at the model parameters, specified by a NamedTuple initial_parameters

UniversalDiffEq.CustomDerivatives — Method

CustomDerivatives(data,derivs!,initial_parameters;kwargs ... )

Constructs a UDE model for the data set data based on user-defined derivatives derivs. An initial guess of model parameters are supplied with the initial_parameters argument.

...

Arguments

data: a DataFrame object with the time of observations in a column labeled t and the remaining columns the value of the state variables at each time point.
derivs: a Function of the form derivs!(du,u,p,t) where u is the value of the state variables, p are the model parameters, t is time, and du is updated with the value of the derivatives
init_parameters: A NamedTuple with the model parameters. Neural network parameters must be listed under the key NN.

kwargs

time_column_name: Name of column in data that corresponds to time. Default is "time".
proc_weight: Weight of process error $omega_{proc}$. Default is 1.0.
obs_weight: Weight of observation error $omega_{obs}$. Default is 1.0.
reg_weight: Weight of regularization error $omega_{reg}$. Default is 10^-6.
reg_type: Type of regularization, whether "L1" or "L2" regularization. Default is "L2".
l: Extrapolation parameter for forecasting. Default is 0.25.
extrap_rho: Extrapolation parameter for forecasting. Default is 0.0.
ode_solver: method to aproximate solutions to the differntail equation. Defaul is Tsit5()
ad_method:method to evalaute derivatives of the ODE solver. Default is ForwardDiffSensitivity()

...

source

Example

The following block of code shows how to build a UDE model based on the Lotka-Volterra predator-prey model where the growth rate of the prey $r$, mortality rate of the predator $m$, and conversion efficiency $\theta$ are estimated, and the predation rate is described by a neural network $NN$

\[\frac{dN}{dt} = rN - NN(N,P) \\ \frac{dP}{dt} = \theta NN(N,P) - mP.\]

To implement the model, we define the neural network and initialize its parameters using the Lux.Chain and Lux.setup functions.

using Lux
# Build the neural network with Lux.Chain
dims_in = 2
hidden_units = 10
nonlinearity = tanh
dims_out = 1
NN = Lux.Chain(Lux.Dense(dims_in,hidden_units,nonlinearity),Lux.Dense(hidden_units,dims_out))

# initialize the neural network states and parameters
using Random
rng = Random.default_rng()
NNparameters, states = Lux.setup(rng,NN)

With the neural network in hand, we define the derivatives of the differential equations model using standard Julia syntax. The derivs function first evaluates the neural network given the abundances of the predators and prey u. The neural network function NN requires three arguments: the state variables u, the network parameters, and the network states. In this example, we store the model parameters in a named tuple p, and we access the neural network parameters with the NN. We access the other model parameters using keys corresponding to their respective names.

function lotka_volterra_derivs!(du,u,p,t)
    C, states = NN(u,p.NN,states)
    du[1] = p.r*u[1] - C[1]
    du[2] = p.theta*C[1] -p.m*u[2]
end

Finally, we can define the initial parameters as a NamedTuple and build the model using the CustomDerivatives function.

using UniversalDiffEq
initial_parameters = (NN = NNparameters, r = 1.0, m = 0.5, theta = 0.5)
model = CustomDerivatives(data,lotka_volterra_derivs!,initial_parameters)

Discrete-time models

Discrete-time models are constructed similarly to continuous-time models. The user provides a dataset, initial parameter values, and the right-hand side of a discrete-time equation with the function step

\[U_{t+1} = \text{step}(u_t,t,p).\]

The function step(u,t,p) takes three arguments: the value of the state variables u, time t, and model parameters p.

UniversalDiffEq.CustomDifference — Method

CustomDifference(data,step,initial_parameters;kwrags...)

Constructs a UDE model for the data set data based on user defined difference equation step. An initial guess of model parameters are supplied with the initial_parameters argument. ...

Arguments

data: a DataFrame object with the time of observations in a column labeled t and the remaining columns the value of the state variables at each time point.
step: a Function of the form step(u,t,p) where u is the value of the state variables, p are the model parameters.
init_parameters: A NamedTuple with the model parameters. Neural network parameters must be listed under the key NN.

kwargs

time_column_name: Name of column in data that corresponds to time. Default is "time".
proc_weight: Weight of process error $omega_{proc}$. Default is 1.0.
obs_weight: Weight of observation error $omega_{obs}$. Default is 1.0.
reg_weight: Weight of regularization error $omega_{reg}$. Default is 10^-6.
reg_type: Type of regularization, whether "L1" or "L2" regularization. Default is "L2".
l: Extrapolation parameter for forecasting. Default is 0.25.
extrap_rho: Extrapolation parameter for forecasting. Default is 0.0.

...

source

Adding covariates

Covariates are added to UDE models by passing a data frame X to the constructor function. The covariates must also be added as an argument to the derivs! function, which has the new form derivs!(du,u,X,p,t), where the third argument X is a vector with the value of each covariate at time t.

UniversalDiffEq.CustomDerivatives — Method

CustomDerivatives(data::DataFrame,X::DataFrame,derivs!::Function,initial_parameters;kwargs ... )

When X is provided the derivs function must have the form derivs!(du,u,x,p,t) where x is a vector with the value of the covariates at time t.

kwargs

time_column_name: Name of column in data and X that corresponds to time. Default is "time".
variable_column_name: Name of column in X that corresponds to the variables. Default is nothing.
value_column_name: Name of column in X that corresponds to the covariates. Default is nothing.
proc_weight: Weight of process error $omega_{proc}$. Default is 1.0.
obs_weight: Weight of observation error $omega_{obs}$. Default is 1.0.
reg_weight: Weight of regularization error $omega_{reg}$. Default is 10^-6.
reg_type: Type of regularization, whether "L1" or "L2" regularization. Default is "L2".
l: Extrapolation parameter for forecasting. Default is 0.25.
extrap_rho: Extrapolation parameter for forecasting. Default is 0.0.
ode_solver: method to aproximate solutions to the differntail equation. Defaul is Tsit5().
ad_method:method to evalaute derivatives of the ODE solver. Default is ForwardDiffSensitivity().

source

Covariates can be added to discrete-time models in the same way. In this case the step function should have four arguments step(u,X,t,p).

UniversalDiffEq.CustomDifference — Method

CustomDifference(data::DataFrame,X,step,initial_parameters;kwargs ... )

When X is provided the step function must have the form step(u,x,t,p) where x is a vector with the value of the covariates at time t.

# kwargs

time_column_name: Name of column in data and X that corresponds to time. Default is "time".
variable_column_name: Name of column in X that corresponds to the variables. Default is nothing.
value_column_name: Name of column in X that corresponds to the covariates. Default is nothing.
proc_weight: Weight of process error $omega_{proc}$. Default is 1.0.
obs_weight: Weight of observation error $omega_{obs}$. Default is 1.0.
reg_weight: Weight of regularization error $omega_{reg}$. Default is 10^-6.
reg_type: Type of regularization, whether "L1" or "L2" regularization. Default is "L2".
l: Extrapolation parameter for forecasting. Default is 0.25.
extrap_rho: Extrapolation parameter for forecasting. Default is 0.0.

source

Example

We extend the Lotka-Volterra equations defined in the prior example to incorporate a covariate X that influences the abundance of predators and prey. We model this effect with linear coefficients $\beta_N$ and $\beta_P$

\[\frac{dN}{dt} = rN - NN(N,P) + \beta_N N \\ \frac{dP}{dt} = \theta NN(N,P) - mP + \beta_P P.\]

# set up neural network
using Lux
dims_in = 2
hidden_units = 10
nonlinearity = tanh
dims_out = 1
NN = Lux.Chain(Lux.Dense(dims_in,hidden_units,nonlinearity),Lux.Dense(hidden_units,dims_out))

# initialize parameters
using Random
rng = Random.default_rng()
NNparameters, NNstates = Lux.setup(rng,NN)

function derivs!(du,u,X,p,t)
    C, states = NN(u,p.NN, NNstates)
    du[1] = p.r*u[1] - C[1] + p.beta[1] * X[1]
    du[2] = p.theta*C[1] -p.m*u[2] + p.beta[2] * X[1]
end

init_parameters = (NN = NNparameters, r = 1.0, m = 0.5, theta = 0.5, beta = [0,0])


model = CustomDerivatives(training_data,X,derivs!,init_parameters;proc_weight=2.0,obs_weight=0.5,reg_weight=10^-4)
nothing

Adding prior information

Users can add priors and custom neural network regularization functions by passing a function to the model constructor that takes the parameters as an argument and returns the loss associated with those parameter values. Please note that the loss functions defined by UniversalDiffEq.jl use the mean squared errors of the model rather than a likelihood function, so priors over the model parameters will not have the usual Bayesian interpretation.

UniversalDiffEq.CustomDerivatives — Method

CustomDerivatives(data::DataFrame,derivs!::Function,initial_parameters,priors::Function;kwargs ... )

When a function's priors is supplied its value will be added to the loss function as a penalty term for user specified parameters. It should take the a single NamedTuple p as an argument penalties for each paramter should be calculated by accessing p with the period operator.

The prior function can be used to nudge the fitted model toward prior expectations for a parameter value. For example, the following function increases the loss when a parameter p.r has a value other than 1.5, nad a second parameter p.beta is greater than zeros.

function priors(p)
    l = 0.01*(p.r - 1.5)^2
    l += 0.01*(p.beta)^2
    return l
end

kwargs

time_column_name: Name of column in data that corresponds to time. Default is "time".
proc_weight: Weight of process error $omega_{proc}$. Default is 1.0.
obs_weight: Weight of observation error $omega_{obs}$. Default is 1.0.
reg_weight: Weight of regularization error $omega_{reg}$. Default is 10^-6.
reg_type: Type of regularization, whether "L1" or "L2" regularization. Default is "L2".
l: Extrapolation parameter for forecasting. Default is 0.25.
extrap_rho: Extrapolation parameter for forecasting. Default is 0.0.
ode_solver: method to aproximate solutions to the differntail equation. Defaul is Tsit5()
ad_method:method to evalaute derivatives of the ODE solver. Default is ForwardDiffSensitivity()

source

UniversalDiffEq.CustomDifference — Method

CustomDifference(data::DataFrame,step,initial_parameters,priors::Function;kwargs ... )

When a function's priors is supplied its value will be added to the loss function as a penalty term for user-specified parameters. It should take the a single NamedTuple p as an argument penalties for each parameter should be calcualted by accessing p with the period operator.

function priors(p)
    l = 0.01*(p.r - 1.5)^2
    l += 0.01*(p.beta)^2
    return l
end

kwargs

time_column_name: Name of column in data that corresponds to time. Default is "time".
proc_weight: Weight of process error $omega_{proc}$. Default is 1.0.
obs_weight: Weight of observation error $omega_{obs}$. Default is 1.0.
reg_weight: Weight of regularization error $omega_{reg}$. Default is 10^-6.
reg_type: Type of regularization, whether "L1" or "L2" regularization. Default is "L2".
l: Extrapolation parameter for forecasting. Default is 0.25.
extrap_rho: Extrapolation parameter for forecasting. Default is 0.0.

source