Model Constructors
UniversalDiffEq.jl provides a set of functions to construct universal differential equations (UDEs) and neural ordinary differential equations (NODEs) with varying levels of customization. The model constructors all require the data to be passed using a DataFrame object from the DataFrames.jl library. The data frame should be organized with one column for time and one column for each dimension of the observations. The name of the column for time is passed using a keyword argument time_column_name
that has a default value "time
".
Table: Example dataset with two state variables
time | $y_1$ | $y_2$ |
---|---|---|
0.1 | 0.0 | -1.1 |
0.2 | 0.01 | -0.9 |
0.5 | 0.51 | -1.05 |
Currently, missing data are not directly supported, but irregular intervals between time points are allowed.
Each constructor function requires additional inputs to specify the model structure. For example, the CustomDerivatives
function requires the user to supply the known functional forms through the derivs!
argument. The subsection for each model type describes these arguments in detail.
Finally, the constructor functions share a set of keyword arguments (kwargs
) used to tune the model fitting procedure. These control the weights given to the process model, observation model, and regularization in the loss function. Larger values of the regularization weight limit the complexity of the relationships learned by the neural network to reduce the likelihood of overfitting. The observation weight controls how closely the estimated states $u_t$ match the observations $y_t$; smaller values of the observation weight correspond to datasets with larger amounts of observation error and vice versa.
proc_weight=1.0
: The weight given to the model predictions in the loss functionobs_weight=1.0
: The weight given to the state estimates in the loss functionreg_weight=0.0
: The weight given to regularization in the loss function
In addition to these weighting parameters, the keyword argument l
controls how far the model will extrapolate beyond the observed data before reverting to a default value extrap_rho
when forecasting.
NODEs (i.e., nonparametric universal dynamic equations)
UniversalDiffEq.jl has two functions to build time series models that use a neural network to learn the dynamics of a time series. The function NODE
builds a continuous-time UDE with a neural network representing the right-hand side of the differential equation
\[ \frac{du}{dt} = NN(u;w,b),\]
The function NNDE
(neural network difference equation) constructs a discrete-time model with a neural network on the right-hand side
\[ x_{t+1} = x_t + NN(x_t).\]
UniversalDiffEq.NODE
— MethodNODE(data;kwargs ... )
Constructs a nonparametric continuous-time model for the data set data
using a single layer neural network to represent the system's dynamics.
# kwargs
time_column_name
: Name of column indata
that corresponds to time. Default is"time"
.hidden_units
: Number of neurons in hidden layer. Default is10
.seed
: Fixed random seed for repeatable results. Default is1
.proc_weight
: Weight of process error $omega_{proc}$. Default is1.0
.obs_weight
: Weight of observation error $omega_{obs}$. Default is1.0
.reg_weight
: Weight of regularization error $omega_{reg}$. Default is10^-6
.reg_type
: Type of regularization, whether"L1"
or"L2"
regularization. Default is"L2"
.l
: Extrapolation parameter for forecasting. Default is0.25
.extrap_rho
: Extrapolation parameter for forecasting. Default is0.0
.
UniversalDiffEq.NNDE
— MethodNNDE(data;kwargs ...)
Constructs a nonparametric discrete-time model for the data set data
using a single layer neural network to represent the system's dynamics.
# kwargs
time_column_name
: Name of column indata
that corresponds to time. Default is"time"
.hidden_units
: Number of neurons in hidden layer. Default is10
.seed
: Fixed random seed for repeatable results. Default is1
.proc_weight
: Weight of process error $omega_{proc}$. Default is1.0
.obs_weight
: Weight of observation error $omega_{obs}$. Default is1.0
.reg_weight
: Weight of regularization error $omega_{reg}$. Default is10^-6
.reg_type
: Type of regularization, whether"L1"
or"L2"
regularization. Default is"L2"
.l
: Extrapolation parameter for forecasting. Default is0.25
.extrap_rho
: Extrapolation parameter for forecasting. Default is0.0
.
Covariates can be included in the model by supplying a second data frame, X.
This data frame must have the same column name for time as the primary dataset, but the time points do not need to match because the values of the covariates between time points included in the data frame X
are interpolated using a linear spline. The NODE
and NNDE
functions will append the value of the covariates at each point in time to the neural network inputs
\[ \frac{dx}{dt} = NN(x,X(t);w,b) \\ x_{t+1} = x_t + NN(x_t, X(t)).\]
UniversalDiffEq.NODE
— MethodNODE(data,X;kwargs ... )
When a data frame X
is supplied the model will run with covariates. The argument X
should have a column for time t
with the value for time in the remaining columns. The values in X
will be interpolated with a linear spline for values of time not included in the data frame.
time_column_name
: Name of column indata
andX
that corresponds to time. Default is"time"
.variable_column_name
: Name of column inX
that corresponds to the variables. Default isnothing
.value_column_name
: Name of column inX
that corresponds to the covariates. Default isnothing
.hidden_units
: Number of neurons in hidden layer. Default is10
.seed
: Fixed random seed for repeatable results. Default is1
.proc_weight
: Weight of process error $omega_{proc}$. Default is1.0
.obs_weight
: Weight of observation error $omega_{obs}$. Default is1.0
.reg_weight
: Weight of regularization error $omega_{reg}$. Default is10^-6
.reg_type
: Type of regularization, whether"L1"
or"L2"
regularization. Default is"L2"
.l
: Extrapolation parameter for forecasting. Default is0.25
.extrap_rho
: Extrapolation parameter for forecasting. Default is0.0
.
Long-format datasets can be used to define models with multiple covariates that have different sampling frequencies. If a long-format dataset is provided, the user must specify which column contains the variable names and which column contains the values of the variables using the variable_column_name
and value_column_name
keywords. In the example below, the variable column name is "variable
", and the value column name is "value
".
Table: Example covariate data in long format
time | variable | value |
---|---|---|
1.0 | X_1 | -1.1 |
2.0 | X_1 | -0.9 |
3.0 | X_1 | -1.05 |
... | ... | ... |
1.0 | X_2 | 3.0 |
2.0 | X_2 | 0.95 |
4.0 | X_2 | -1.25 |
Customizing universal dynamic equations
The CustomDerivatives
and CustomDifference
functions can be used to build models that combine neural networks and known functional forms. These functions take user-defined models, construct a loss function, and access the model fitting and testing functions provided by UniversalDiffEq.jl.
Continuous-time models
The CustomDerivatives
function builds SS-UDE models based on a user-defined function derivs!(du,u,p,t)
, which updates the vector du
with the right-hand side of a differential equation evaluated at time t
in state u
given parameters p
. The function also needs an initial guess at the model parameters, specified by a NamedTuple initial_parameters
UniversalDiffEq.CustomDerivatives
— MethodCustomDerivatives(data,derivs!,initial_parameters;kwargs ... )
Constructs a UDE model for the data set data
based on user-defined derivatives derivs
. An initial guess of model parameters are supplied with the initial_parameters
argument.
...
Arguments
- data: a DataFrame object with the time of observations in a column labeled
t
and the remaining columns the value of the state variables at each time point. - derivs: a Function of the form
derivs!(du,u,p,t)
whereu
is the value of the state variables,p
are the model parameters,t
is time, and du is updated with the value of the derivatives - init_parameters: A
NamedTuple
with the model parameters. Neural network parameters must be listed under the keyNN
.
kwargs
time_column_name
: Name of column indata
that corresponds to time. Default is"time"
.proc_weight
: Weight of process error $omega_{proc}$. Default is1.0
.obs_weight
: Weight of observation error $omega_{obs}$. Default is1.0
.reg_weight
: Weight of regularization error $omega_{reg}$. Default is10^-6
.reg_type
: Type of regularization, whether"L1"
or"L2"
regularization. Default is"L2"
.l
: Extrapolation parameter for forecasting. Default is0.25
.extrap_rho
: Extrapolation parameter for forecasting. Default is0.0
.
...
Example
The following block of code shows how to build a UDE model based on the Lotka-Volterra predator-prey model where the growth rate of the prey $r$, mortality rate of the predator $m$, and conversion efficiency $\theta$ are estimated, and the predation rate is described by a neural network $NN$
\[\frac{dN}{dt} = rN - NN(N,P) \\ \frac{dP}{dt} = \theta NN(N,P) - mP.\]
To implement the model, we define the neural network and initialize its parameters using the Lux.Chain
and Lux.setup
functions.
using Lux
# Build the neural network with Lux.Chain
dims_in = 2
hidden_units = 10
nonlinearity = tanh
dims_out = 1
NN = Lux.Chain(Lux.Dense(dims_in,hidden_units,nonlinearity),Lux.Dense(hidden_units,dims_out))
# initialize the neural network states and parameters
using Random
rng = Random.default_rng()
NNparameters, states = Lux.setup(rng,NN)
With the neural network in hand, we define the derivatives of the differential equations model using standard Julia syntax. The derivs
function first evaluates the neural network given the abundances of the predators and prey u
. The neural network function NN
requires three arguments: the state variables u
, the network parameters, and the network states. In this example, we store the model parameters in a named tuple p
, and we access the neural network parameters with the NN
. We access the other model parameters using keys corresponding to their respective names.
function lotka_volterra_derivs!(du,u,p,t)
C, states = NN(u,p.NN,states)
du[1] = p.r*u[1] - C[1]
du[2] = p.theta*C[1] -p.m*u[2]
end
Finally, we can define the initial parameters as a NamedTuple and build the model using the CustomDerivatives
function.
using UniversalDiffEq
initial_parameters = (NN = NNparameters, r = 1.0, m = 0.5, theta = 0.5)
model = CustomDerivatives(data,lotka_volterra_derivs!,initial_parameters)
Discrete-time models
Discrete-time models are constructed similarly to continuous-time models. The user provides a dataset, initial parameter values, and the right-hand side of a discrete-time equation with the function step
\[U_{t+1} = \text{step}(u_t,t,p).\]
The function step(u,t,p)
takes three arguments: the value of the state variables u
, time t
, and model parameters p
.
UniversalDiffEq.CustomDifference
— MethodCustomDifference(data,step,initial_parameters;kwrags...)
Constructs a UDE model for the data set data
based on user defined difference equation step
. An initial guess of model parameters are supplied with the initial_parameters argument. ...
Arguments
- data: a DataFrame object with the time of observations in a column labeled
t
and the remaining columns the value of the state variables at each time point. - step: a Function of the form
step(u,t,p)
whereu
is the value of the state variables,p
are the model parameters. - init_parameters: A
NamedTuple
with the model parameters. Neural network parameters must be listed under the keyNN
.
kwargs
time_column_name
: Name of column indata
that corresponds to time. Default is"time"
.proc_weight
: Weight of process error $omega_{proc}$. Default is1.0
.obs_weight
: Weight of observation error $omega_{obs}$. Default is1.0
.reg_weight
: Weight of regularization error $omega_{reg}$. Default is10^-6
.reg_type
: Type of regularization, whether"L1"
or"L2"
regularization. Default is"L2"
.l
: Extrapolation parameter for forecasting. Default is0.25
.extrap_rho
: Extrapolation parameter for forecasting. Default is0.0
.
...
Adding covariates
Covariates are added to UDE models by passing a data frame X
to the constructor function. The covariates must also be added as an argument to the derivs!
function, which has the new form derivs!(du,u,X,p,t)
, where the third argument X
is a vector with the value of each covariate at time t
.
UniversalDiffEq.CustomDerivatives
— MethodCustomDerivatives(data::DataFrame,X::DataFrame,derivs!::Function,initial_parameters;kwargs ... )
When a data frame X
is supplied the model will run with covariates. The argument X
should have a column for time t
with the value for time in the remaining columns. The values in X
will be interpolated with a linear spline for values of time not included in the data frame.
When X
is provided the derivs function must have the form derivs!(du,u,x,p,t)
where x
is a vector with the value of the covariates at time t
.
kwargs
time_column_name
: Name of column indata
andX
that corresponds to time. Default is"time"
.variable_column_name
: Name of column inX
that corresponds to the variables. Default isnothing
.value_column_name
: Name of column inX
that corresponds to the covariates. Default isnothing
.proc_weight
: Weight of process error $omega_{proc}$. Default is1.0
.obs_weight
: Weight of observation error $omega_{obs}$. Default is1.0
.reg_weight
: Weight of regularization error $omega_{reg}$. Default is10^-6
.reg_type
: Type of regularization, whether"L1"
or"L2"
regularization. Default is"L2"
.l
: Extrapolation parameter for forecasting. Default is0.25
.extrap_rho
: Extrapolation parameter for forecasting. Default is0.0
.
Covariates can be added to discrete-time models in the same way. In this case the step
function should have four arguments step(u,X,t,p)
.
UniversalDiffEq.CustomDifference
— MethodCustomDifference(data::DataFrame,X,step,initial_parameters;kwargs ... )
When a data frame X
is supplied the model will run with covariates. The argument X
should have a column for time t
with the value for time in the remaining columns. The values in X
will be interpolated with a linear spline for value of time not included in the data frame.
When X
is provided the step function must have the form step(u,x,t,p)
where x
is a vector with the value of the covariates at time t
.
# kwargs
time_column_name
: Name of column indata
andX
that corresponds to time. Default is"time"
.variable_column_name
: Name of column inX
that corresponds to the variables. Default isnothing
.value_column_name
: Name of column inX
that corresponds to the covariates. Default isnothing
.proc_weight
: Weight of process error $omega_{proc}$. Default is1.0
.obs_weight
: Weight of observation error $omega_{obs}$. Default is1.0
.reg_weight
: Weight of regularization error $omega_{reg}$. Default is10^-6
.reg_type
: Type of regularization, whether"L1"
or"L2"
regularization. Default is"L2"
.l
: Extrapolation parameter for forecasting. Default is0.25
.extrap_rho
: Extrapolation parameter for forecasting. Default is0.0
.
Example
We extend the Lotka-Volterra equations defined in the prior example to incorporate a covariate X
that influences the abundance of predators and prey. We model this effect with linear coefficients $\beta_N$ and $\beta_P$
\[\frac{dN}{dt} = rN - NN(N,P) + \beta_N N \\ \frac{dP}{dt} = \theta NN(N,P) - mP + \beta_P P.\]
# set up neural network
using Lux
dims_in = 2
hidden_units = 10
nonlinearity = tanh
dims_out = 1
NN = Lux.Chain(Lux.Dense(dims_in,hidden_units,nonlinearity),Lux.Dense(hidden_units,dims_out))
# initialize parameters
using Random
rng = Random.default_rng()
NNparameters, NNstates = Lux.setup(rng,NN)
function derivs!(du,u,X,p,t)
C, states = NN(u,p.NN, NNstates)
du[1] = p.r*u[1] - C[1] + p.beta[1] * X[1]
du[2] = p.theta*C[1] -p.m*u[2] + p.beta[2] * X[1]
end
init_parameters = (NN = NNparameters, r = 1.0, m = 0.5, theta = 0.5, beta = [0,0])
model = CustomDerivatives(training_data,X,derivs!,init_parameters;proc_weight=2.0,obs_weight=0.5,reg_weight=10^-4)
nothing
Adding prior information
Users can add priors and custom neural network regularization functions by passing a function to the model constructor that takes the parameters as an argument and returns the loss associated with those parameter values. Please note that the loss functions defined by UniversalDiffEq.jl use the mean squared errors of the model rather than a likelihood function, so priors over the model parameters will not have the usual Bayesian interpretation.
UniversalDiffEq.CustomDerivatives
— MethodCustomDerivatives(data::DataFrame,derivs!::Function,initial_parameters,priors::Function;kwargs ... )
When a function's priors is supplied its value will be added to the loss function as a penalty term for user specified parameters. It should take the a single NamedTuple p
as an argument penalties for each paramter should be calculated by accessing p
with the period operator.
The prior function can be used to nudge the fitted model toward prior expectations for a parameter value. For example, the following function increases the loss when a parameter p.r
has a value other than 1.5, nad a second parameter p.beta
is greater than zeros.
function priors(p)
l = 0.01*(p.r - 1.5)^2
l += 0.01*(p.beta)^2
return l
end
kwargs
time_column_name
: Name of column indata
that corresponds to time. Default is"time"
.proc_weight
: Weight of process error $omega_{proc}$. Default is1.0
.obs_weight
: Weight of observation error $omega_{obs}$. Default is1.0
.reg_weight
: Weight of regularization error $omega_{reg}$. Default is10^-6
.reg_type
: Type of regularization, whether"L1"
or"L2"
regularization. Default is"L2"
.l
: Extrapolation parameter for forecasting. Default is0.25
.extrap_rho
: Extrapolation parameter for forecasting. Default is0.0
.
UniversalDiffEq.CustomDifference
— MethodCustomDifference(data::DataFrame,step,initial_parameters,priors::Function;kwargs ... )
When a function's priors is supplied its value will be added to the loss function as a penalty term for user-specified parameters. It should take the a single NamedTuple p
as an argument penalties for each parameter should be calcualted by accessing p
with the period operator.
function priors(p)
l = 0.01*(p.r - 1.5)^2
l += 0.01*(p.beta)^2
return l
end
kwargs
time_column_name
: Name of column indata
that corresponds to time. Default is"time"
.proc_weight
: Weight of process error $omega_{proc}$. Default is1.0
.obs_weight
: Weight of observation error $omega_{obs}$. Default is1.0
.reg_weight
: Weight of regularization error $omega_{reg}$. Default is10^-6
.reg_type
: Type of regularization, whether"L1"
or"L2"
regularization. Default is"L2"
.l
: Extrapolation parameter for forecasting. Default is0.25
.extrap_rho
: Extrapolation parameter for forecasting. Default is0.0
.