JiangXianghua / Industrial Systems Engineering and Management; National University of Singapore
YeZhisheng / National University of Singapore;Department of Industrial Systems Engineering and Management
TangLoon-Ching / National University of Singapore;Department of Industrial Systems Engineering and Management
ZhangXun / National University of Singapore;Institute of Operations Research and Analytics
We study the multi-period risk-averse inventory control problem in a data-driven setting.
In this problem, a risk-averse retailer makes periodic decisions on inventory levels based only on historical demand observations without full knowledge of the demand distribution.
We adopt the popular nested formulation for risk-averse programs to formulate this multi-period problem and its data-driven counterpart under a coherent risk measure.
Our objective is to study the sample complexity bound such that with high probability, the data-driven policy is near-optimal, i.e., the relative error of risk under the data-driven policy compared with the optimal risk is arbitrarily small.
Analysis of this problem is inherently challenging, because the multi-period nature requires solving the risk-averse program and its data-driven version recursively backward in time, while the (empirical) risk-to-go functions in this process do not have closed-form derivatives for most risk measures, which renders existing first-order methods for the risk-neutral newsvendor model invalid.
In this study, we develop a zero-order framework to establish the complexity bound on sample sizes to guarantee near-optimality of the data-driven policy with given accuracy levels.
Instead of using first-order derivative information on the risk-to-go function, our analysis directly examines the class of functions that underpin each cumulative risk function and derives maximum inequalities for this functional class by computing the covering numbers.
Finite-sample complexity bounds are then used to establish asymptotic properties of the estimated risk, including consistency and convergence rate.
Computationally, the time complexity for solving the data-driven policy, which is essentially a sample average approximation (SAA) estimator of the optimal policy, increases exponentially in the length of the planning horizon.
To speed computation,
we propose an approximation scheme that recursively approximates the empirical cumulative risk function with a convex piecewise linear function and then minimize it to obtain a modified data-driven inventory policy.
We show that with proper control for approximation error, the modified data-driven policy is also near-optimal, and it has the same order of sample complexity bound as that for the original SAA policy.