API
ValueFunctionIterations.DiscreteAndContinuous
— Method
DiscreteAndContinuous(N::Int, grids; kwargs...)
Defines a value function that combines discrete and continuous state variables.
Arguments
- N: the number of discrete states
- grids: a vector with an element for each level of the discrete state, each of which is itself a vector of grids for the continuous states
Key words:
- v0: Initial value, defaults to 0.
- order: order of BSpline approximation, defaults to BSpline(Cubic(Line(OnGrid())))
- extrap: extrapolation method, defaults to Flat().
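A minimal construction sketch based on the signature above; the grid values are invented for illustration, and it is assumed each grid can be an AbstractRange:
```julia
using ValueFunctionIterations

# Two discrete regimes, each with the same one-dimensional continuous grid
# (grids has one entry per level of the discrete state)
grids = [[0.0:0.1:10.0], [0.0:0.1:10.0]]
V = DiscreteAndContinuous(2, grids; v0 = 0.0)
```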
ValueFunctionIterations.DynamicProgram
— Type
DynamicProgram
Stores the data required to define a dynamic programming problem, along with the value function V and policy function P.
Elements:
- R: the reward function, takes the form R(s,u,X,p) where s is the state, u is the decision variable, X is the random variable input, and p are the parameters
- F: the state update function, takes the form F(s,u,X,p).
- p: the state update parameters, a ComponentArray; it must be compatible with both R and F.
- u: the decision variables, a matrix where each column gives a possible value of the decision variable u.
- X: the random variables, an AbstractRandomVariable object.
- δ: the discount factor, Float64.
- V: the value function, an AbstractValueFunction object.
- P: the policy function, an AbstractValueFunction object.
ValueFunctionIterations.DynamicProgram
— Method
DynamicProgram(R::Function, F::Function, p::ComponentArray, u::Matrix{Float64}, X::AbstractRandomVariable, δ::Float64, grid...; kwargs...)
Solves a continuous-state, discrete-action dynamic optimization problem using value function iteration and returns the solution as a DynamicProgram object.
Arguments:
- V: the value function, an AbstractValueFunction object
- R: the reward function, takes the form R(s,u,X,p) where s is the state, u is the decision variable, X is the random variable input, and p are the parameters
- F: the state update function, takes the form F(s,u,X,p).
- p: the state update parameters, a ComponentArray; it must be compatible with both R and F.
- u: the decision variables, a matrix where each column gives a possible value of the decision variable u.
- X: the random variables, an AbstractRandomVariable object.
- δ: the discount factor, Float64.
Keyword arguments:
- solve: whether to solve the problem, defaults to "conditional". With the default, the problem is not solved if the estimated solve time is longer than ten minutes; this can be overridden by setting solve = true to always solve or solve = false to never solve.
- order_policy: the order of the interpolation for the policy function, defaults to Constant()
- extrap: the extrapolation method, defaults to Flat()
- tolerance: the tolerance for VFI convergence, defaults to 1e-5
- maxiter: the maximum number of iterations, defaults to round(Int, 3/(1-δ))
Values:
- a DynamicProgram object
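A hypothetical end-to-end sketch following the argument list above. The stochastic-growth reward, dynamics, parameter values, and grid are invented for illustration, and solve = false is passed so construction does not trigger a long solve:
```julia
using ValueFunctionIterations, ComponentArrays

# Reward and state update in the documented (s, u, X, p) form;
# s, u, and X are assumed to be indexable
R(s, u, X, p) = log(max(s[1] - u[1], 1e-8))     # consume whatever is not reinvested
F(s, u, X, p) = [p.a * u[1]^p.b * exp(X[1])]    # stochastic production from investment

p = ComponentArray(a = 1.0, b = 0.5)
u = permutedims(collect(0.1:0.1:2.0))           # 1 x 20 matrix: each column is one action
X = GaussHermiteRandomVariable(5, [0.0], fill(0.01, 1, 1))
δ = 0.95

# The single continuous state grid is passed as a range via grid...
DP = DynamicProgram(R, F, p, u, X, δ, 0.1:0.1:5.0; solve = false)
```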
ValueFunctionIterations.MCRandomVariable
— Type
MCRandomVariable
A struct that represents a multivariate random variable with samples that can be updated in place.
Elements:
- N: the number of samples
- dims: the dimension of the random variable
- nodes: a matrix of samples
- weights: a vector of weights (1/N)
- sample: a function for drawing samples from the distribution
ValueFunctionIterations.MCRandomVariable
— Method
MCRandomVariable(sample::Function, N::Int)
Initializes an instance of a MCRandomVariable using a sampler function and a desired number of samples.
The MCRandomVariable stores the samples in a matrix of size dims x N, where dims is the dimension of the random variable and N is the number of samples. Calling the MCRandomVariable object as a function updates the samples in place, allowing for memory-efficient resampling.
The MCRandomVariable object also stores a vector of weights, which are initialized to 1/N. This allows the MCRandomVariable to be substituted for quadrature schemes represented by the RandomVariables.jl interface.
Arguments:
- sample: a function for drawing samples from the distribution
- N: the number of samples
Values:
- a MCRandomVariable object
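A brief sketch; the zero-argument form of the sampler is an assumption, since the docstring only says it draws samples from the distribution:
```julia
using ValueFunctionIterations

sampler() = randn(2)                 # one draw from a 2-d standard normal (assumed form)
X = MCRandomVariable(sampler, 1000)  # stores a 2 x 1000 matrix of samples with weights 1/1000

X()                                  # re-draws the samples in place, as documented above
```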
ValueFunctionIterations.MCRandomVariable
— Method
MCRandomVariable(sample::Function, X::RandomVariable, N::Int)
Initializes an instance of a MCRandomVariable using a sampler function, a RandomVariable object, and a desired number of samples.
This function initializes a MCRandomVariable object that represents the Cartesian product of the variable represented by the sample function and X. This is useful if you want to represent the product of a continuous random variable, handled with Monte Carlo methods, and a discrete random variable, handled by taking weighted sums.
The MCRandomVariable object stores the samples in a matrix of size d x (M x N), where d is the dimension of the random variable and M is the number of nodes in X. The first elements of each node are the sample from the sample function; the remaining elements are the nodes from X.
The weights for each node are equal to 1/N times the corresponding weight from X.
Arguments:
- sample: a function for drawing samples from the distribution
- X: a RandomVariable object
- N: the number of samples
Values:
- a MCRandomVariable object
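A sketch combining a Monte Carlo shock with a discrete chain, following the argument list above; the sampler form and transition matrix are illustrative:
```julia
using ValueFunctionIterations

sampler() = randn(1)                        # continuous shock, assumed zero-argument sampler
chain = MarkovChain([0.9 0.1; 0.2 0.8])     # discrete RandomVariable with two nodes
X = MCRandomVariable(sampler, chain, 500)   # Cartesian product: 500 draws x 2 nodes
```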
ValueFunctionIterations.RandomVariable
— Type
RandomVariable
This type defines a quadrature scheme for multivariate random variables. The nodes are a matrix of values, with each column corresponding to a point in the set of possible samples, and the weights give the probability of that point.
Elements:
- nodes: a matrix of values
- weights: a vector of weights
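For illustration, a two-point discrete distribution built with the default struct constructor (assumed to be available):
```julia
using ValueFunctionIterations

nodes = [-1.0 1.0]            # one column per support point of a 1-d variable
weights = [0.5, 0.5]
X = RandomVariable(nodes, weights)
```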
ValueFunctionIterations.RegularGridBspline
— Method
RegularGridBspline(dims...; kwargs...)
Defines a value function approximated by a B-spline over a regular grid of continuous state variables.
Arguments
- dims...: a regular grid for each state variable.
Key words:
- v0: Initial value, defaults to 0.
- order: order of BSpline approximation, defaults to BSpline(Cubic(Line(OnGrid())))
- extrap: extrapolation method, defaults to Flat().
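A short sketch with one range per continuous state variable (the grids are made up):
```julia
using ValueFunctionIterations

# Two-dimensional continuous state on a regular grid
V = RegularGridBspline(0.0:0.1:10.0, 0.0:0.5:20.0; v0 = 0.0)
```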
ValueFunctionIterations.GaussHermiteRandomVariable
— Method
GaussHermiteRandomVariable(m::Int64, mu::AbstractVector{Float64}, Cov::AbstractMatrix{Float64})
Returns a RandomVariable with weights and nodes for a multivariate normal distribution with covariance matrix Cov and mean vector mu. The weights and nodes are chosen using a Gauss-Hermite quadrature scheme.
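For example, a five-node rule for a bivariate normal (the mean and covariance are illustrative):
```julia
using ValueFunctionIterations

mu  = [0.0, 0.0]
Cov = [0.1 0.0; 0.0 0.2]
X = GaussHermiteRandomVariable(5, mu, Cov)   # quadrature nodes and weights for the bivariate normal
```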
ValueFunctionIterations.MarkovChain
— Method
MarkovChain(p)
Builds a RandomVariable object that represents a Markov chain and is compatible with ValueFunctionIterations.jl. The goal of this object is to take expectations over the outcome of a Markov chain as efficiently as possible using the RandomVariables.jl interface.
The RandomVariable object is designed to work with the sample_markov_chain function. The nodes are intended to be passed to sample_markov_chain as the random number argument. If the nodes are sampled using the weights stored in the RandomVariable object and passed to sample_markov_chain, the results will be the same as if a uniform random number had been sampled.
This allows the weights and nodes to be used to calculate expectations over the outcome of the Markov chain with the minimum number of computations, without changing the weights as a function of the current state.
Arguments:
- p: an m x m transition probability matrix
Values:
- a RandomVariable object
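A small sketch with an invented two-state transition matrix:
```julia
using ValueFunctionIterations

P = [0.9 0.1;
     0.2 0.8]          # two-state transition probability matrix
X = MarkovChain(P)

# X.nodes can be passed to sample_markov_chain as the random-number argument,
# with X.weights giving the probability of each node.
```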
ValueFunctionIterations.action_space
— Method
action_space(U...)
Returns the Cartesian product of the action spaces U, each defined as an AbstractRange object.
Arguments
- U: two or more AbstractRange objects
Values
- a matrix with the cartesian product of the action spaces U.
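For example, with two decision variables defined on ranges (the names and values are hypothetical), assuming the binding is action_space as in the header above:
```julia
using ValueFunctionIterations

harvest = 0.0:0.5:2.0
effort  = 0.0:0.25:1.0
u = action_space(harvest, effort)   # each column is one (harvest, effort) combination
```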
ValueFunctionIterations.estimate_time
— Method
estimate_time(DP::DynamicProgram)
Estimate how long a dynamic program will take to solve.
Arguments:
- DP: a DynamicProgram object
Value
- a dictionary with keys "Estimate", "One call", "Number of computations", "Estimated iterations"
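A quick check before committing to a long solve, reusing the DP object sketched in the DynamicProgram example above:
```julia
info = estimate_time(DP)
info["Estimate"]        # estimated total solve time
```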
ValueFunctionIterations.get_actions_function
— Method
get_actions_function(DP::DynamicProgram)
Returns the action space of the dynamic program DP.
Arguments:
- DP: a DynamicProgram object
Values:
- a matrix of size dim(u) by the number of actions
ValueFunctionIterations.get_discount_factor_function
— Method
get_discount_factor_function(DP::DynamicProgram)
Returns the discount factor for the dynamic program DP.
Arguments:
- DP: a DynamicProgram object
Values:
- a Float64
ValueFunctionIterations.get_parameters_function
— Method
get_parameters_function(DP::DynamicProgram)
Returns the parameters of the dynamic program DP.
Arguments:
- DP: a DynamicProgram object
Values:
- a ComponentVector of model parameters
ValueFunctionIterations.get_policy_function
— Method
get_policy_function(DP::DynamicProgram)
Returns the policy function of the dynamic program DP.
Arguments:
- DP: a DynamicProgram object
Values:
- an AbstractValueFunction object
ValueFunctionIterations.get_random_variables_function
— Method
get_random_variables_function(DP::DynamicProgram)
Returns the random variable for the dynamic program DP.
Arguments:
- DP: a DynamicProgram object
Values:
- an AbstractRandomVariable object
ValueFunctionIterations.get_reward_function
— Method
get_reward_function(DP::DynamicProgram)
Returns the reward function of the dynamic program DP.
Arguments:
- DP: a DynamicProgram object
Values:
- a function with arguments (s,u,X,p)
ValueFunctionIterations.get_update_function
— Method
get_update_function(DP::DynamicProgram)
Returns the state update function of the dynamic program DP.
Arguments:
- DP: a DynamicProgram object
Values:
- a function with arguments (s,u,X,p)
ValueFunctionIterations.get_value_function
— Method
get_value_function(DP::DynamicProgram)
Returns the value function of the dynamic program DP.
Arguments:
- DP: a DynamicProgram object
Values:
- an AbstractValueFunction object
ValueFunctionIterations.load_solution
— Method
load_solution(DP, filename)
Loads the value and policy functions stored at filename and overwrites the policy and value functions in the dynamic program DP.
ValueFunctionIterations.product
— Method
product(X::RandomVariable, Y::RandomVariable)
Returns a RandomVariable that is the Cartesian product of two independent RandomVariables.
Arguments
- X: a RandomVariable
- Y: a RandomVariable
Values
- a RandomVariable
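For instance, combining a Gauss-Hermite shock with a Markov chain (both invented for illustration):
```julia
using ValueFunctionIterations

X = GaussHermiteRandomVariable(5, [0.0], fill(0.1, 1, 1))
Y = MarkovChain([0.9 0.1; 0.2 0.8])
XY = product(X, Y)   # nodes and weights over the joint distribution
```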
ValueFunctionIterations.sample_discrete
— Method
sample_discrete(p, rng)
Samples an index in 1:n, with the probability mass of each index given by p, using a draw rng from a uniform random variable on (0,1).
Arguments:
- p: a vector of probabilities
- rng: a number in [0,1]
Values:
- an integer in 1:length(p)
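A short usage sketch:
```julia
using ValueFunctionIterations

p = [0.2, 0.5, 0.3]            # probability mass over three outcomes
i = sample_discrete(p, rand()) # index in 1:3
```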
ValueFunctionIterations.sample_markov_chain
— Method
sample_markov_chain(x, p, rng)
Samples from a Markov chain, given the current state x and the transition matrix p, using a uniform random draw rng on the unit interval.
Arguments:
- x: the current state (integer in 1:m)
- p: an m x m transition probability matrix
- rng: a number in [0,1]
Values:
- an integer in 1:m
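A short usage sketch with an invented transition matrix:
```julia
using ValueFunctionIterations

P = [0.9 0.1; 0.2 0.8]
x = 1
x_next = sample_markov_chain(x, P, rand())   # next state in 1:2
```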
ValueFunctionIterations.save_solution
— Method
save_solution(DP, filename)
Saves the value and policy functions of the dynamic program DP to the file filename.
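A round-trip sketch; the file name is illustrative, since the expected storage format and extension are not stated above:
```julia
save_solution(DP, "solution_file")   # DP from the DynamicProgram example above
load_solution(DP, "solution_file")   # overwrite DP's value and policy functions
```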
ValueFunctionIterations.simulate
— Method
simulate(DP::DynamicProgram, T::Int)
Simulates the dynamic program DP under the optimal policy for T timesteps. The function returns the states in a matrix of size dims(s) by T+1, the actions in a matrix of size dim(u) by T, the rewards in a vector of length T, and the values in a vector of length T+1.
Arguments:
- DP: a DynamicProgram object
- T: the number of timesteps to simulate
Values:
- the states in a matrix of size dims(s) by T+1
- the actions in a matrix of size dim(u) by T
- the rewards in a vector of size T
- the values in a vector of size T+1
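For example, assuming the four outputs are returned in the order listed above:
```julia
states, actions, rewards, values = simulate(DP, 100)   # 100 timesteps under the solved policy
```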
ValueFunctionIterations.simulate
— Method
simulate(DP::DynamicProgram, X::MCRandomVariable, T::Int)
Simulates the dynamic program DP under the optimal policy for T timesteps, sampling the random variables at each timestep using the sampler in X. The function returns the states in a matrix of size dims(s) by T+1, the actions in a matrix of size dim(u) by T, the rewards in a vector of length T, and the values in a vector of length T+1.
Arguments:
- DP: a DynamicProgram object
- X: a MCRandomVariable object
- T: the number of timesteps to simulate
Values:
- the states in a matrix of size dims(s) by T+1
- the actions in a matrix of size dim(u) by T
- the rewards in a vector of size T
- the values in a vector of size T+1
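The same call with shocks redrawn by a Monte Carlo sampler (the zero-argument sampler form is assumed):
```julia
Xmc = MCRandomVariable(() -> randn(1), 1)
states, actions, rewards, values = simulate(DP, Xmc, 100)
```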
ValueFunctionIterations.solve!
— Method
solve!(DP::DynamicProgram; kwargs...)
Runs the VFI algorithm to solve the dynamic program DP and updates the value and policy functions in place.
Arguments:
- DP: a DynamicProgram object
Key words:
- order_policy: the order of the interpolation for the policy function, defaults to Constant()
- tolerance: the tolerance for VFI convergence, defaults to 1e-5
- maxiter: the maximum number of iterations, defaults to round(Int, 3/(1-δ))
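For example, after constructing a problem with solve = false:
```julia
solve!(DP; tolerance = 1e-6, maxiter = 500)   # DP from the DynamicProgram example above
```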