API

ValueFunctionIterations.DiscreteAndContinuousMethod
DiscreteAndContinuous(N::Int,grids; kwargs...)

Defines a value function that combines discrete and continuous state variables ...

Arguments

  • N::Int: number of discrete states
  • grids: a vector with an element for each level of the discrete state, where each element is itself a vector with the grids for the continuous states

Key words:

  • v0: Initial value, defaults to 0.
  • order: order of BSpline approximation, defaults to BSpline(Cubic(Line(OnGrid())))
  • extrap: extrapolation method, defaults to Flat().
source
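A minimal sketch of constructing this value function follows; the number of discrete levels, grid bounds, and grid lengths are illustrative assumptions, not values from the package documentation.

```julia
using ValueFunctionIterations

# Three discrete levels, each paired with the same 50-point grid over one
# continuous state on [0, 10] (all values illustrative).
grids = [[range(0.0, 10.0, length = 50)] for _ in 1:3]
V = DiscreteAndContinuous(3, grids)
```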
ValueFunctionIterations.DynamicProgramType
DynamicProgram

Stores the data required to define a dynamic programming problem along with the value function V and policy function P. ...

Elements:

- R: the reward function, takes the form R(s,u,X,p) where s is the state, u is the decision variable, X is the random variable inputs, and p are the parameters
- F: the state update function, takes the form F(s,u,X,p).
- p: the state update parameters, a ComponentArray; it must be compatible with both R and F.
- u: the decision variables, a matrix where each column gives a possible value of the decision variable u.
- X: the random variables, an AbstractRandomVariable object.
- δ: the discount factor, Float64.
- V: the value function, an AbstractValueFunction object.
- P: the policy function, an AbstractValueFunction object.
source
ValueFunctionIterations.DynamicProgramMethod
DynamicProgram(R::Function, F::Function,  p::ComponentArray, u::Matrix{Float64}, X::AbstractRandomVariable, δ::Float64, grid...; kwrds...  )

Solves a continuous state, discrete action dynamic optimization problem using value function iteration and returns the solution as a DynamicProgram object. ...

Arguments:

- V: the value function, an AbstractValueFunction object
- R: the reward function, takes the form R(s,u,X,p) where s is the state, u is the decision variable, X is the random variable inputs, and p are the parameters
- F: the state update function, takes the form F(s,u,X,p).
- p: the state update parameters, a ComponentArray; it must be compatible with both R and F.
- u: the decision variables, a matrix where each column gives a possible value of the decision variable u.
- X: the random variables, an AbstractRandomVariable object.
- δ: the discount factor, Float64.

Keyword arguments:

- solve: whether to solve the problem, defaults to "conditional". The problem will not be solved if the estimated time is longer than ten minutes; this can be overridden by setting solve = true to always solve or solve = false to never solve.
- order_policy: the order of the interpolation for the policy function, defaults to Constant()
- extrap: the value to use for extrapolation, defaults to Flat()
- tolerance: the tolerance for VFI convergence, defaults to 1e-5
- maxiter: the maximum number of iterations, defaults to round(Int, 3/(1-δ))

Values:

- a DynamicProgram object
source
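A hedged sketch of building and solving a small one-state, one-action problem follows. Only the constructor signature is taken from the documentation above; the reward and update functions, parameter values, shock distribution, and grid are illustrative assumptions.

```julia
using ValueFunctionIterations, ComponentArrays

# Illustrative harvest problem: take u from a stock s that regrows at rate r
# and receives an additive normal shock X (all values are assumptions).
R(s, u, X, p) = p.price * u[1]                 # period reward: revenue from harvest
F(s, u, X, p) = [p.r * (s[1] - u[1]) + X[1]]   # next state
p = ComponentArray(price = 1.0, r = 1.05)
u = reshape(collect(0.0:0.1:1.0), 1, :)        # each column is one harvest level
X = GaussHermiteRandomVariable(5, [0.0], fill(0.01, 1, 1))
δ = 0.95
grid = range(0.0, 10.0, length = 100)

DP = DynamicProgram(R, F, p, u, X, δ, grid; solve = true)
```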
ValueFunctionIterations.MCRandomVariableType
MCRandomVariable

A struct that represents a multivariate random variable with samples that can be updated in place. ...

Elements:

- N: the number of samples
- dims: the dimension of the random variable
- nodes: a matrix of samples
- weights: a vector of weights (1/N)
- sample: a function for drawing samples from the distribution

source
ValueFunctionIterations.MCRandomVariableMethod
MCRandomVariable(sample::Function, N::Int)

Initializes an instance of a MCRandomVariable using a sampler function and a desired number of samples.

The MCRandomVariable stores the samples in a matrix of size dims x N, where dims is the dimension of the random variable and N is the number of samples. Calling the MCRandomVariable object as a function updates the samples in place, allowing for memory-efficient resampling.

The MCRandomVariable object also stores a vector of weights which are initialized to 1/N. This allows the MCRandomVariable to be substituted for quadrature schemes represented by the RandomVariables.jl interface. ...

Arguments:

- sample: a function for drawing samples from the distribution
- N: the number of samples

Values:

- a MCRandomVariable object
source
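A minimal sketch follows, assuming the sampler is a zero-argument function that returns a single draw as a vector (the exact sampler signature is not stated above and is an assumption).

```julia
using ValueFunctionIterations

# 1000 Monte Carlo draws from a one-dimensional standard normal shock.
# The zero-argument sampler signature is an assumption.
X = MCRandomVariable(() -> randn(1), 1000)
X()   # per the docstring, calling the object resamples its nodes in place
```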
ValueFunctionIterations.MCRandomVariableMethod
MCRandomVariable(sample::Function, X::RandomVariable, N::Int)

Initializes an instance of a MCRandomVariable using a sampler function, a RandomVariable object, and a desired number of samples.

This function will initialize an MCRandomVariable object that represents the cartesian product of the variable represented by the sample function and X. This is useful if you want to represent the product of a continuous random variable, handled with Monte Carlo methods, and a discrete random variable, handled by taking weighted sums.

The MCRandomVariable object stores the samples in a matrix of size d x (M x N), where d is the dimension of the random variable and M is the number of nodes in X. The first elements of each node represent the sample from the sample function. The remaining elements represent the nodes from X.

The weights for each node are equal to 1/N times the corresponding weight from X. ...

Arguments:

- sample: a function for drawing samples from the distribution
- X: a RandomVariable object 
- N: the number of samples

Values:

- a MCRandomVariable object
source
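A hedged sketch combining a Monte Carlo shock with a discrete RandomVariable follows. Both the argument order (taken from the Arguments list above) and the zero-argument sampler signature are assumptions.

```julia
using ValueFunctionIterations

chain = MarkovChain([0.9 0.1; 0.2 0.8])           # discrete RandomVariable (see MarkovChain below)
X = MCRandomVariable(() -> randn(1), chain, 500)  # joint nodes: Monte Carlo draw first, then chain nodes
```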
ValueFunctionIterations.RandomVariableType
RandomVariable

This struct defines a quadrature scheme for multivariate random variables. The nodes are a matrix of values, with each column corresponding to a point in the set of possible samples, and the weights give the probability of that point. ...

Elements:

- nodes: a matrix of values
- weights: a vector of weights

source
ValueFunctionIterations.RegularGridBsplineMethod
RegularGridBspline(dims...; kwargs...)

Defines a value function over continuous state variables approximated with B-splines on a regular grid ...

Arguments

  • dims...: a regular grid for each state variable.

Key words:

  • v0: Initial value, defaults to 0.
  • order: order of BSpline approximation, defaults to BSpline(Cubic(Line(OnGrid())))
  • extrap: extrapolation method, defaults to Flat().
source
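A minimal sketch, assuming each positional argument is a regular grid for one continuous state variable (grid bounds and lengths are illustrative):

```julia
using ValueFunctionIterations

# Two continuous states on regular grids (all values illustrative).
V = RegularGridBspline(range(0.0, 1.0, length = 50), range(-2.0, 2.0, length = 30))
```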
ValueFunctionIterations.GaussHermiteRandomVariableMethod
GaussHermiteRandomVariable(m::Int64,mu::AbstractVector{Float64},Cov::AbstractMatrix{Float64})

Returns a RandomVariable with weights and nodes for a multivariate normal distribution with covariance matrix Cov and mean vector mu. The weights and nodes are chosen using a Gauss-Hermite quadrature scheme.

source
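For example, a 5-node rule for a bivariate normal shock might be constructed as follows (the mean and covariance values are illustrative):

```julia
using ValueFunctionIterations

mu  = [0.0, 0.0]
Cov = [0.1 0.0; 0.0 0.2]
X = GaussHermiteRandomVariable(5, mu, Cov)
```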
ValueFunctionIterations.MarkovChainMethod
MarkovChain(p)

Builds a RandomVariable object that represents a Markov chain and is compatible with ValueFunctionIterations.jl. The goal of this object is to take expectations over the outcome of a Markov chain as efficiently as possible using the RandomVariables.jl interface.

The RandomVariable object is defined to work with the sample_markov_chain function. The nodes are intended to be passed to the sample_markov_chain function as the random number argument. If the nodes are sampled using the weights stored in the RandomVariable object and passed to sample_markov_chain, the results will be the same as if a uniform random number was sampled.

This allows the weights and nodes to be used to calculate expectations over the outcome of the Markov chain with the minimum number of computations possible, without changing the weights as a function of the current state. ...

Arguments:

- p: an m x m transition probability matrix

Values:

- a RandomVariable object
source
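A minimal sketch for a two-state chain; the orientation of the transition matrix (whether rows or columns sum to one) is not stated above and is assumed here:

```julia
using ValueFunctionIterations

P = [0.9 0.1; 0.2 0.8]   # 2 x 2 transition probabilities (orientation assumed)
X = MarkovChain(P)
```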
ValueFunctionIterations.action_spaceMethod
action_space_product(U...)

Returns the cartesian product of the action spaces U, defined as AbstractRange objects. ...

Arguments

- U: two or more AbstractRange objects

Values

- a matrix with the cartesian product of the action spaces U.
source
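For example, two one-dimensional ranges can be combined into a two-row action matrix (the column ordering is determined by the package):

```julia
using ValueFunctionIterations

# Each column of U is one (u1, u2) action pair.
U = action_space_product(0.0:0.5:1.0, 0.0:1.0:2.0)
```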
ValueFunctionIterations.estimate_timeMethod
estimate_time(DP::DynamicProgram)

Estimate how long a dynamic program will take to solve. ...

Arguments:

- DP: a DynamicProgram object

Value

- a dictionary with keys "Estimate", "One call", "Number of computations", "Estimated iterations"
source
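A sketch of checking the projected cost before a long run, assuming DP is an existing DynamicProgram (for example, one built with solve = false):

```julia
using ValueFunctionIterations

info = estimate_time(DP)        # DP assumed to be an existing DynamicProgram
info["Estimate"]                # projected total solve time
info["Estimated iterations"]    # projected number of VFI iterations
```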
ValueFunctionIterations.get_update_functionMethod
get_update_function(DP::DynamicProgram)

Returns the state update function of the dynamic program DP. ...

Arguments:

- DP: a DynamicProgram object

Values:

- a function with arguments (s,u,X,p)
source
ValueFunctionIterations.productMethod
product(X:: RandomVariable, Y:: RandomVariable)

Returns a RandomVariable that is the cartesian product of two independent RandomVariables. ...

Arguments

- X: a RandomVariable
- Y: a RandomVariable

Values

- a RandomVariable
source
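For example, two independent normal shocks represented by Gauss-Hermite rules can be combined (all numeric values are illustrative):

```julia
using ValueFunctionIterations

Z1 = GaussHermiteRandomVariable(5, [0.0], fill(0.1, 1, 1))
Z2 = GaussHermiteRandomVariable(7, [1.0], fill(0.5, 1, 1))
XY = product(Z1, Z2)   # nodes and weights for the joint variable
```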
ValueFunctionIterations.sample_discreteMethod
sample_discrete(p,rng)

Samples an index in 1:n, with the probability mass for each index given by p, using a draw rng from a uniform random variable on (0,1). ...

Arguments:

- p: a vector of probabilities
- rng: a number in [0,1]

Values:

- an integer in 1:length(p)
source
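For example:

```julia
using ValueFunctionIterations

p = [0.2, 0.5, 0.3]              # probability mass over three outcomes
i = sample_discrete(p, rand())   # integer in 1:3
```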
ValueFunctionIterations.sample_markov_chainMethod
sample_markov_chain(x,p,rng)

Samples from a Markov chain given the current state x and the transition matrix p, using a uniform random variable rng on the unit interval. ...

Arguments:

- x: the current state (integer in 1:m)
- p: an m x m transition probability matrix
- rng: a number in [0,1]

Values:

- an integer in 1:m
source
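For example, advancing a two-state chain one step from state 1 (transition matrix orientation assumed):

```julia
using ValueFunctionIterations

P = [0.9 0.1; 0.2 0.8]
x_next = sample_markov_chain(1, P, rand())   # integer in 1:2
```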
ValueFunctionIterations.simulateMethod
simulate(DP::DynamicProgram,T::Int)

Simulates the dynamic program DP under the optimal policy for T timesteps. The function returns the states in a matrix of size dims(s) by T+1, the actions in a matrix of size dim(u) by T, the rewards in a vector of size T, and the values in a vector of size T+1. ...

Arguments:

- DP: a DynamicProgram object
- T: the number of timesteps to simulate

Values:

- the states in a matrix of size dims(s) by T+1
- the actions in a matrix of size dim(u) by T  
- the rewards in a vector of size T
- the values in a vector of size T+1
source
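A sketch assuming DP is a solved DynamicProgram and that the four return values unpack as a tuple in the order listed above:

```julia
using ValueFunctionIterations

# States, actions, rewards, and values over 100 timesteps.
s, u, r, v = simulate(DP, 100)
```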
ValueFunctionIterations.simulateMethod
simulate(DP::DynamicProgram,X::MCRandomVariable,T::Int)

Simulates the dynamic program DP under the optimal policy for T timesteps, sampling the random variables at each timestep using the sampler in X. The function returns the states in a matrix of size dims(s) by T+1, the actions in a matrix of size dim(u) by T, the rewards in a vector of size T, and the values in a vector of size T+1. ...

Arguments:

- DP: a DynamicProgram object
- X: a MCRandomVariable object
- T: the number of timesteps to simulate

Values:

- the states in a matrix of size dims(s) by T+1
- the actions in a matrix of size dim(u) by T  
- the rewards in a vector of size T
- the values in a vector of size T+1
source
ValueFunctionIterations.solve!Method
solve!(DP::DynamicProgram; kwrds...)

Runs the VFI algorithm to solve the dynamic program DP and updates the value and policy functions in place ...

Arguments:

- DP: a DynamicProgram object

Key words:

- order_policy: the order of the interpolation for the policy function, defaults to Constant()
- tolerance: the tolerance for VFI convergence, defaults to 1e-5
- maxiter: the maximum number of iterations, defaults to round(Int, 3/(1-δ))
source
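A sketch assuming DP is a DynamicProgram that was constructed without solving (for example, with solve = false):

```julia
using ValueFunctionIterations

# Solve in place with a tighter tolerance than the default 1e-5.
solve!(DP; tolerance = 1e-8, maxiter = 500)
```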