Black-Box Stress Testing
Unhide for installation (waiting on Julia registry).
using Distributions, Parameters, POMDPStressTesting, Latexify, PlutoUI
To find failures in a black-box autonomous system, we can use the POMDPStressTesting package, which is part of the POMDPs.jl ecosystem.
Various solvers, all of which adhere to the POMDPs.jl interface, can be used:
MCTSPWSolver (MCTS with action progressive widening)
TRPOSolver and PPOSolver (deep reinforcement learning policy optimization)
CEMSolver (cross-entropy method)
RandomSearchSolver
Simple Problem: One-Dimensional Walk
We define a simple problem for adaptive stress testing (AST) to find failures. This problem, called Walk1D, samples random walking distances from a standard normal distribution, Normal(0, 1), and treats walking past a threshold (±10 in this example) as a failure.
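As a point of reference before bringing in AST, the walk can also be sampled directly. The following is a minimal sketch in plain Julia (standard library only) and is not part of POMDPStressTesting. The names walk_fails and naive_failure_rate are hypothetical, and the threshold of 10 and end time of 30 are assumed to match the parameters defined below.

```julia
using Random

# Simulate one walk with standard normal steps; return true if |x| ever
# reaches the threshold within `endtime` steps.
function walk_fails(rng::AbstractRNG; startx=0.0, threshx=10.0, endtime=30)
    x = startx
    for _ in 1:endtime
        x += randn(rng)
        abs(x) ≥ threshx && return true
    end
    return false
end

# Fraction of failing walks among `n` independent samples.
naive_failure_rate(rng::AbstractRNG, n::Integer) = count(_ -> walk_fails(rng), 1:n) / n

rng = MersenneTwister(0)
println("Naive Monte Carlo failure rate: ", naive_failure_rate(rng, 20_000))
```

This is unguided sampling, so it says nothing about which failures are most likely; that is the question AST is meant to answer.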
Gray-Box Simulator and Environment
The simulator and environment are treated as gray-box because we need access to the state-transition distributions and their associated likelihoods.
Parameters
First, we define the parameters of our simulation.
@with_kw mutable struct Walk1DParams
    startx::Float64 = 0 # Starting x-position
    threshx::Float64 = 10 # ± boundary threshold
    endtime::Int64 = 30 # Simulation end time
end;
Simulation
Next, we define a GrayBox.Simulation structure.
@with_kw mutable struct Walk1DSim <: GrayBox.Simulation
    params::Walk1DParams = Walk1DParams() # Parameters
    x::Float64 = 0 # Current x-position
    t::Int64 = 0 # Current time
    distribution::Distribution = Normal(0, 1) # Transition distribution
end;
GrayBox.environment
Then, we define our GrayBox.Environment distributions. When using the ASTSampleAction, as opposed to ASTSeedAction, we need to provide access to the sampleable environment.
GrayBox.environment(sim::Walk1DSim) = GrayBox.Environment(:x => sim.distribution)
GrayBox.transition!
We override the transition! function from the GrayBox interface, which takes an environment sample as input. We apply the sample in our simulator and return the log-likelihood.
function GrayBox.transition!(sim::Walk1DSim, sample::GrayBox.EnvironmentSample)
    sim.t += 1 # Keep track of time
    sim.x += sample[:x].value # Move agent using sampled value from input
    return logpdf(sample)::Real # Summation handled by `logpdf()`
end
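For a single step drawn from Normal(0, 1), the log-likelihood has the closed form log p(x) = -x²/2 - log(2π)/2. The sketch below evaluates it in plain Julia for two sampled values copied from the action trace printed in the search output later in this notebook. It assumes that the second number in each printed Sample is the log-probability of the first; the helper stdnormal_logpdf is hypothetical and stands in for the logpdf call above.

```julia
# Log-density of the standard normal distribution Normal(0, 1) at x.
stdnormal_logpdf(x::Real) = -0.5 * x^2 - 0.5 * log(2π)

# Sampled values copied from the action trace in the search output.
for x in (0.015328, 2.37764)
    println("x = ", x, "  logpdf = ", stdnormal_logpdf(x))
end
```

Larger steps are less likely under the transition distribution, so they carry a more negative log-likelihood.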
Black-Box System
The system under test, in this case a simple single-dimensional moving agent, is always treated as black-box. The following interface functions are overridden to minimally interact with the system, and use outputs from the system to determine failure event indications and distance metrics.
BlackBox.initialize!
Now we override the BlackBox interface, starting with the function that initializes the simulation object. Interface functions ending in ! may modify the sim object in place.
function BlackBox.initialize!(sim::Walk1DSim)
    sim.t = 0
    sim.x = sim.params.startx
end
BlackBox.distance
We define how close we are to a failure event using a non-negative distance metric.
BlackBox.distance(sim::Walk1DSim) = max(sim.params.threshx - abs(sim.x), 0)
BlackBox.isevent
We define an indication that a failure event occurred.
BlackBox.isevent(sim::Walk1DSim) = abs(sim.x) ≥ sim.params.threshx
BlackBox.isterminal
Similarly, we define an indication that the simulation is in a terminal state.
function BlackBox.isterminal(sim::Walk1DSim)
    return BlackBox.isevent(sim) || sim.t ≥ sim.params.endtime
end
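To see how the distance, event, and terminal checks behave together, here is a small stand-alone sketch. The functions miss_distance, is_event, and is_terminal are hypothetical plain-Julia counterparts of the interface methods above, written as functions of the position x and time t, with the threshold of 10 and end time of 30 assumed as defaults.

```julia
# Stand-alone versions of the three checks, written as plain functions of the
# position x and time t (assumed defaults: threshold 10, end time 30).
miss_distance(x; threshx=10.0) = max(threshx - abs(x), 0)
is_event(x; threshx=10.0) = abs(x) ≥ threshx
is_terminal(x, t; threshx=10.0, endtime=30) = is_event(x; threshx=threshx) || t ≥ endtime

# A start state, a near miss, a failure, and a timeout without failure.
for (x, t) in ((0.0, 0), (9.17896, 9), (10.4903, 10), (3.0, 30))
    println((x=x, t=t, distance=miss_distance(x), event=is_event(x), terminal=is_terminal(x, t)))
end
```

The distance shrinks to zero exactly when the event check becomes true, and a walk can also terminate by running out of time with a positive distance.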
BlackBox.evaluate!
Lastly, we use our defined interface to evaluate the system under test. Using the input sample, we return the log-likelihood, distance to an event, and event indication.
function BlackBox.evaluate!(sim::Walk1DSim, sample::GrayBox.EnvironmentSample)
    logprob::Real = GrayBox.transition!(sim, sample) # Step simulation
    d::Real = BlackBox.distance(sim) # Calculate miss distance
    event::Bool = BlackBox.isevent(sim) # Check event indication
    return (logprob::Real, d::Real, event::Bool)
end
AST Setup and Running
To set up the simulation, we instantiate our simulation object and pass it to the Markov decision process (MDP) object of the adaptive stress testing formulation. We use Monte Carlo tree search (MCTS) with progressive widening on the action space as our solver. Hyperparameters are passed to MCTSPWSolver, which is a simple wrapper around the POMDPs.jl implementation of MCTS. Lastly, we solve the MDP to produce a planner. Note that we are using the ASTSampleAction.
function setup_ast(seed=0)
    # Create gray-box simulation object
    sim::GrayBox.Simulation = Walk1DSim()
    # AST MDP formulation object
    mdp::ASTMDP = ASTMDP{ASTSampleAction}(sim)
    mdp.params.debug = true # record metrics
    mdp.params.top_k = 10 # record top k best trajectories
    mdp.params.seed = seed # set RNG seed for determinism
    # Hyperparameters for MCTS-PW as the solver
    solver = MCTSPWSolver(n_iterations=1000, # number of algorithm iterations
                          exploration_constant=1.0, # UCT exploration
                          k_action=1.0, # action widening
                          alpha_action=0.5, # action widening
                          depth=sim.params.endtime) # tree depth
    # Get online planner (no work done, yet)
    planner = solve(solver, mdp)
    return planner
end;
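The two action-widening hyperparameters control how quickly new actions are added at a tree node. A common form of the progressive-widening rule, assumed here rather than taken from the solver's source, adds a new action whenever the number of child actions is at most k * N^α, where N is the node's visit count. The helper count_widened_actions is hypothetical and only counts how many actions a single node would accumulate under that assumption.

```julia
# Count how many actions a single node accumulates under the assumed rule:
# add a new action whenever n_children ≤ k * N^α, where N is the visit count.
function count_widened_actions(n_visits::Integer; k=1.0, α=0.5)
    n_children = 0
    for N in 0:(n_visits - 1)
        if n_children ≤ k * N^α
            n_children += 1 # widen: sample and attach a new action
        end
    end
    return n_children
end

for n in (10, 100, 1000)
    println(n, " visits -> ", count_widened_actions(n), " actions")
end
```

With k = 1 and α = 0.5, the number of actions grows roughly like the square root of the visit count, so most visits revisit existing actions instead of sampling new ones.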
Searching for Failures
After setup, we search for failures using the planner and output the best action trace.
planner = setup_ast();
action_trace = search!(planner)
ASTSampleAction(Dict{Symbol,POMDPStressTesting.AST.GrayBox.Sample}(:x=>Sample{Float64}(0.015328, -0.919056)))
ASTSampleAction(Dict{Symbol,POMDPStressTesting.AST.GrayBox.Sample}(:x=>Sample{Float64}(0.28838, -0.96052)))
ASTSampleAction(Dict{Symbol,POMDPStressTesting.AST.GrayBox.Sample}(:x=>Sample{Float64}(1.19606, -1.63422)))
ASTSampleAction(Dict{Symbol,POMDPStressTesting.AST.GrayBox.Sample}(:x=>Sample{Float64}(0.609811, -1.10487)))
ASTSampleAction(Dict{Symbol,POMDPStressTesting.AST.GrayBox.Sample}(:x=>Sample{Float64}(0.843249, -1.27447)))
ASTSampleAction(Dict{Symbol,POMDPStressTesting.AST.GrayBox.Sample}(:x=>Sample{Float64}(1.47323, -2.00414)))
ASTSampleAction(Dict{Symbol,POMDPStressTesting.AST.GrayBox.Sample}(:x=>Sample{Float64}(2.37764, -3.74553)))
ASTSampleAction(Dict{Symbol,POMDPStressTesting.AST.GrayBox.Sample}(:x=>Sample{Float64}(1.23179, -1.67759)))
ASTSampleAction(Dict{Symbol,POMDPStressTesting.AST.GrayBox.Sample}(:x=>Sample{Float64}(1.14347, -1.5727)))
ASTSampleAction(Dict{Symbol,POMDPStressTesting.AST.GrayBox.Sample}(:x=>Sample{Float64}(1.31136, -1.77877)))
Playback
We can also play back specific trajectories and print intermediate x-values.
0.0
0.015328
0.303708
1.49977
2.10958
2.95283
4.42605
6.8037
8.03549
9.17896
10.4903
playback_trace = playback(planner, action_trace, sim->sim.x, return_trace=true)
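Because each action in this problem simply adds the sampled step to x, the played-back positions can be cross-checked by hand. The sketch below assumes that the first number in each printed Sample of the action trace is the sampled step value, and recomputes the positions as a cumulative sum in plain Julia.

```julia
# Sampled step values copied from the action trace in the search output.
steps = [0.015328, 0.28838, 1.19606, 0.609811, 0.843249,
         1.47323, 2.37764, 1.23179, 1.14347, 1.31136]

# Positions after each step, starting from x = 0.
xs = cumsum(steps)

# Index of the first step at which |x| reaches the threshold of 10.
first_failure = findfirst(x -> abs(x) ≥ 10, xs)

println("positions: ", round.(xs, digits=5))
println("first failure at step ", first_failure)
```

The recomputed positions agree with the playback output up to print rounding, and the threshold is crossed only on the last step of the trace.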
0.8159216715195341
failure_rate = print_metrics(planner)
Other Solvers: Cross-Entropy Method
We can easily take our ASTMDP object (planner.mdp) and re-solve the MDP using a different solver, in this case the CEMSolver.
mdp = planner.mdp; # reused from above
CEMSolver
n_iterations: Int64 1000
episode_length: Int64 30
num_samples: Int64 100
min_elite_samples: Int64 10
max_elite_samples: Int64 9223372036854775807
elite_thresh: Float64 -0.99
weight_fn: #16 (function of type POMDPStressTesting.var"#16#22")
add_entropy: #17 (function of type POMDPStressTesting.var"#17#23")
show_progress: Bool true
verbose: Bool false
cem_solver = CEMSolver(n_iterations=1000, episode_length=mdp.sim.params.endtime)
cem_planner = solve(cem_solver, mdp);
cem_action_trace = search!(cem_planner);
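For intuition about what a cross-entropy search does on this problem, here is a generic sketch in plain Julia. It is not the CEMSolver implementation: it fits an independent Normal(μ[t], σ[t]) proposal per time step, keeps the samples with the smallest terminal distance as elites, and refits. The names terminal_distance and cem_walk, the elite count, the iteration count, and the σ floor are all assumptions made for illustration.

```julia
using Random, Statistics

# Distance to the ±threshx boundary when the walk terminates (0 = failure).
function terminal_distance(steps; threshx=10.0)
    x = 0.0
    for s in steps
        x += s
        abs(x) ≥ threshx && return 0.0 # failure event ends the walk
    end
    return threshx - abs(x) # no event: distance at the end time
end

# Generic cross-entropy method over per-step Normal(μ[t], σ[t]) proposals.
function cem_walk(rng::AbstractRNG; horizon=30, n_samples=100, n_elite=10,
                  n_iterations=20, σ_min=0.1)
    μ, σ = zeros(horizon), ones(horizon)
    for _ in 1:n_iterations
        samples = [μ .+ σ .* randn(rng, horizon) for _ in 1:n_samples]
        scores = terminal_distance.(samples)
        elite = samples[sortperm(scores)[1:n_elite]] # smallest distances
        E = reduce(hcat, elite) # horizon × n_elite matrix
        μ = vec(mean(E, dims=2))
        σ = max.(vec(std(E, dims=2)), σ_min) # floor avoids collapse
    end
    return μ, σ
end

rng = MersenneTwister(0)
μ, σ = cem_walk(rng)
fails = count(_ -> terminal_distance(μ .+ σ .* randn(rng, 30)) == 0, 1:1000)
println("failures under the fitted proposal: ", fails, " / 1000")
```

The printed count describes the fitted proposal, not the failure probability under the original Normal(0, 1) steps, since the proposal has been deliberately shifted toward failures.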
Notice the failure rate reported below (about 12.88) is more than 10 times that of MCTSPWSolver (about 0.82).
12.882493795314756
cem_failure_rate = print_metrics(cem_planner)
AST Reward Function
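The AST reward function gives a reward of 0 if a failure event is found at termination, the negative miss distance -d if the simulation terminates without an event, and the log-likelihood log(p) of the transition otherwise.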
function R(p,e,d,τ)
    if τ && e
        return 0
    elseif τ && !e
        return -d
    else
        return log(p)
    end
end
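To make the reward concrete, the sketch below sums it over a trajectory. It restates R compactly so the block runs on its own, and it uses log-probabilities copied from the action trace in the search output, assuming the second number in each printed Sample is the log-probability. The near-miss distance of 0.82 is an illustrative value, not a result from the notebook.

```julia
# Compact restatement of the reward above: p = transition probability,
# e = event indicator, d = miss distance, τ = terminal indicator.
R(p, e, d, τ) = τ ? (e ? 0.0 : -d) : log(p)

# Log-probabilities of the first nine steps of the action trace.
logprobs = [-0.919056, -0.96052, -1.63422, -1.10487, -1.27447,
            -2.00414, -3.74553, -1.67759, -1.5727]

# Nine non-terminal steps, then a terminal step that ends in a failure event.
total = sum(R(exp(lp), false, 0.0, false) for lp in logprobs) +
        R(exp(-1.77877), true, 0.0, true)
println("return of the failure trajectory: ", total)

# The same non-terminal steps, followed by a terminal miss at distance 0.82.
near_miss = sum(logprobs) + R(1.0, false, 0.82, true)
println("return of a near miss: ", near_miss)
```

Under this definition a trajectory that ends in a failure always scores higher than the same steps ending in a miss, and among failures the more likely trajectories score higher.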
References
Robert J. Moss, Ritchie Lee, Nicholas Visser, Joachim Hochwarth, James G. Lopez, and Mykel J. Kochenderfer, "Adaptive Stress Testing of Trajectory Predictions in Flight Management Systems", Digital Avionics Systems Conference, 2020.
PlutoUI.TableOfContents("POMDPStressTesting.jl")