Black-Box Stress Testing

Unhide for installation (waiting on Julia registry).

To find failures in a black-box autonomous system, we can use the POMDPStressTesting package, which is part of the POMDPs.jl ecosystem.

Various solvers—which adhere to the POMDPs.jl interface—can be used:

  • MCTSPWSolver (MCTS with action progressive widening)

  • TRPOSolver and PPOSolver (deep reinforcement learning policy optimization)

  • CEMSolver (cross-entropy method)

  • RandomSearchSolver

Simple Problem: One-Dimensional Walk

We define a simple problem for adaptive stress testing (AST) to find failures. This problem, called Walk1D, samples random walking distances from a standard normal distribution N(0,1) and defines a failure as walking past a threshold (set to ±10 in this example). AST either selects a seed that deterministically controls the value sampled from the distribution (i.e., from the transition model) or directly samples the provided environment distributions. These two action modes correspond to the seed-action and sample-action options. AST guides the simulation toward failure events using a notion of distance to failure, while simultaneously searching for the set of actions that maximizes the log-likelihood of the samples.
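
The difference between the two action modes can be illustrated with a small, language-neutral sketch in Python. This is not the POMDPStressTesting API: the function names `step_from_seed` and `step_from_sample` are hypothetical and only show that a seed fixes the sampled step, whereas direct sampling draws from the distribution itself.

```python
import random

def step_from_seed(seed):
    """Seed-action mode (sketch): the chosen seed fully determines the step."""
    rng = random.Random(seed)
    return rng.gauss(0.0, 1.0)

def step_from_sample(rng):
    """Sample-action mode (sketch): draw the step directly from N(0, 1)."""
    return rng.gauss(0.0, 1.0)

# The same seed always reproduces the same walking distance.
seeded_a = step_from_seed(42)
seeded_b = step_from_seed(42)

# Direct sampling produces a fresh value on every call.
rng = random.Random(0)
direct_steps = [step_from_sample(rng) for _ in range(3)]
```

In the seed-action case the search operates over integers that index random outcomes; in the sample-action case it operates over the sampled values themselves.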

Gray-Box Simulator and Environment

The simulator and environment are treated as gray-box because we need access to the state-transition distributions and their associated likelihoods.

Parameters

First, we define the parameters of our simulation.

Simulation

Next, we define a GrayBox.Simulation structure.
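
As a rough illustration of what the parameters and the simulation state might hold, here is a minimal Python sketch. The ±10 threshold comes from the problem description above; the starting position of 0, the end time of 30, and all class and field names are assumptions made for this example, not the package's definitions.

```python
from dataclasses import dataclass, field

@dataclass
class Walk1DParams:
    startx: float = 0.0    # starting x-position (assumed)
    threshx: float = 10.0  # failure threshold, applied as +/- threshx
    endtime: int = 30      # maximum number of steps per episode (assumed)

@dataclass
class Walk1DSim:
    # default_factory gives every simulation its own parameter object
    params: Walk1DParams = field(default_factory=Walk1DParams)
    x: float = 0.0  # current x-position
    t: int = 0      # current time step

sim = Walk1DSim()
```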

GrayBox.environment

Then, we define our GrayBox.Environment distributions. When using the ASTSampleAction, as opposed to ASTSeedAction, we need to provide access to the sampleable environment.

GrayBox.transition!

We override the transition! function from the GrayBox interface, which takes an environment sample as input. We apply the sample in our simulator, and return the log-likelihood.
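
The idea of a transition that consumes an environment sample and returns its log-likelihood can be sketched in Python as follows. The dictionary layout of the sample and the helper names (`sample_environment`, `transition`, `normal_logpdf`) are assumptions for illustration and do not mirror the GrayBox types.

```python
import math
import random

def normal_logpdf(x, mu=0.0, sigma=1.0):
    """Log-density of N(mu, sigma^2) evaluated at x."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)

def sample_environment(rng):
    """Draw one environment sample: a named value and its log-probability."""
    value = rng.gauss(0.0, 1.0)
    return {"x": {"value": value, "logprob": normal_logpdf(value)}}

def transition(state, sample):
    """Apply the sampled step to the state in place; return the log-likelihood."""
    state["t"] += 1                     # advance time
    state["x"] += sample["x"]["value"]  # move the agent by the sampled distance
    # Sum over all environment variables (only one here).
    return sum(entry["logprob"] for entry in sample.values())

state = {"x": 0.0, "t": 0}
sample = sample_environment(random.Random(1))
logprob = transition(state, sample)
```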

Black-Box System

The system under test, in this case a simple single-dimensional moving agent, is always treated as black-box. The following interface functions are overridden to minimally interact with the system, and use outputs from the system to determine failure event indications and distance metrics.

BlackBox.initialize!

Now we override the BlackBox interface, starting with the function that initializes the simulation object. Interface functions ending in ! may modify the sim object in place.

BlackBox.distance

We define how close we are to a failure event using a non-negative distance metric.

BlackBox.isevent

We define an indication that a failure event occurred.

BlackBox.isterminal

Similarly, we define an indication that the simulation is in a terminal state.
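
For the one-dimensional walk, the distance, event, and terminal checks could look like the Python sketch below. It assumes that reaching the threshold exactly counts as a failure and that an episode ends after 30 steps; both choices, and the function names, are illustrative rather than taken from the package.

```python
def distance(x, threshx=10.0):
    """Non-negative miss distance: how far |x| is from the failure threshold."""
    return max(threshx - abs(x), 0.0)

def is_event(x, threshx=10.0):
    """A failure event occurs once the agent reaches +/- threshx (assumed >=)."""
    return abs(x) >= threshx

def is_terminal(x, t, threshx=10.0, endtime=30):
    """The episode ends on a failure or when the time limit is reached."""
    return is_event(x, threshx) or t >= endtime
```

For example, `distance(7.5)` is `2.5`, and the distance is clipped to `0.0` once the agent is at or beyond the threshold.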

BlackBox.evaluate!

Lastly, we use our defined interface to evaluate the system under test. Using the input sample, we return the log-likelihood, distance to an event, and event indication.

AST Setup and Running

To set up the simulation, we instantiate our simulation object and pass it to the Markov decision process (MDP) object of the adaptive stress testing formulation. We use Monte Carlo tree search (MCTS) with progressive widening on the action space as our solver. Hyperparameters are passed to MCTSPWSolver, which is a simple wrapper around the POMDPs.jl implementation of MCTS. Lastly, we solve the MDP to produce a planner. Note that we use ASTSampleAction here.

Searching for Failures

After setup, we search for failures using the planner and output the best action trace.
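
To make the idea of searching for a best action trace concrete, the following Python sketch replaces MCTS with plain random search, which is a much weaker technique and is used here only because it fits in a few lines. It assumes a threshold of 10, an episode length of 30, and a per-step reward that follows the AST reward described later in this document; all names are hypothetical.

```python
import math
import random

THRESHX = 10.0  # failure threshold from the problem description
ENDTIME = 30    # assumed episode length

def normal_logpdf(x):
    """Log-density of the standard normal distribution at x."""
    return -0.5 * x * x - 0.5 * math.log(2.0 * math.pi)

def rollout(rng):
    """Simulate one episode; return (actions, total_reward, event)."""
    x, total, actions = 0.0, 0.0, []
    for t in range(1, ENDTIME + 1):
        step = rng.gauss(0.0, 1.0)
        actions.append(step)
        x += step
        event = abs(x) >= THRESHX
        terminal = event or t >= ENDTIME
        if terminal and event:
            reward = 0.0                           # failure found
        elif terminal:
            reward = -max(THRESHX - abs(x), 0.0)   # miss distance penalty
        else:
            reward = normal_logpdf(step)           # log-likelihood of the step
        total += reward
        if terminal:
            return actions, total, event

def random_search(n_episodes, seed=0):
    """Run independent rollouts; return the best episode and the failure fraction."""
    rng = random.Random(seed)
    episodes = [rollout(rng) for _ in range(n_episodes)]
    n_failures = sum(1 for ep in episodes if ep[2])
    best = max(episodes, key=lambda ep: ep[1])
    return best, n_failures / n_episodes

best, failure_fraction = random_search(1000)
```

Here `failure_fraction` is a fraction of episodes in this toy setup and is not comparable to the failure rates reported by the notebook.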

Playback

We can also play back specific trajectories and print intermediate x-values.

failure_rate = 0.8159216715195341

Other Solvers: Cross-Entropy Method

We can easily take our ASTMDP object (planner.mdp) and re-solve the MDP using a different solver—in this case the CEMSolver.
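
For readers unfamiliar with the cross-entropy method, the Python sketch below shows its basic loop on the walk problem: sample trajectories, keep an elite subset, and refit the sampling distribution to the elites. It is a simplified illustration and not the package's CEMSolver: it scores trajectories only by final position in the positive direction, ignores early termination and likelihood weighting, and uses only 10 iterations. The sample and elite counts (100 and 10) echo the solver settings shown in this notebook.

```python
import random
import statistics

THRESHX = 10.0  # failure threshold from the problem description
ENDTIME = 30    # assumed episode length

def cross_entropy_search(n_iterations=10, num_samples=100, num_elite=10, seed=0):
    """Fit per-step normal distributions that push the walk toward +THRESHX."""
    rng = random.Random(seed)
    mus = [0.0] * ENDTIME     # per-step means, starting at N(0, 1)
    sigmas = [1.0] * ENDTIME  # per-step standard deviations
    for _ in range(n_iterations):
        # Sample full trajectories from the current proposal distribution.
        samples = [[rng.gauss(m, s) for m, s in zip(mus, sigmas)]
                   for _ in range(num_samples)]
        # Rank by final position; a larger sum is closer to the threshold.
        samples.sort(key=sum, reverse=True)
        elite = samples[:num_elite]
        # Refit each step's distribution to the elite trajectories.
        for t in range(ENDTIME):
            column = [traj[t] for traj in elite]
            mus[t] = statistics.fmean(column)
            sigmas[t] = max(statistics.stdev(column), 0.1)  # floor avoids collapse
    return mus, sigmas

mus, sigmas = cross_entropy_search()
expected_final_x = sum(mus)  # mean final position under the fitted proposal
```

Because this version does not account for how likely the sampled steps are under the original N(0,1) model, it finds failures quickly but says nothing about how probable they are.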

cem_solver
CEMSolver
  n_iterations: Int64 1000
  episode_length: Int64 30
  num_samples: Int64 100
  min_elite_samples: Int64 10
  max_elite_samples: Int64 9223372036854775807
  elite_thresh: Float64 -0.99
  weight_fn: #16 (function of type POMDPStressTesting.var"#16#22")
  add_entropy: #17 (function of type POMDPStressTesting.var"#17#23")
  show_progress: Bool true
  verbose: Bool false

Notice that the failure rate is more than 10x that of MCTSPWSolver in this run (about 12.88 versus about 0.82).

cem_failure_rate = 12.882493795314756

AST Reward Function

The AST reward function gives a reward of 0 if an event is found, a reward of negative distance if no event is found at termination, and the log-likelihood during the simulation.

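$$
R(p, e, d, \tau) =
\begin{cases}
0 & \text{if } \tau \land e \\
-d & \text{if } \tau \land \neg e \\
\log(p) & \text{otherwise}
\end{cases}
$$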
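
The reward can be written directly as a small function. The Python sketch below follows the description above, with `p` the transition probability, `e` the event indication, `d` the distance, and `tau` the terminal indication; the function name `ast_reward` is hypothetical.

```python
import math

def ast_reward(p, e, d, tau):
    """AST reward: 0 on a terminal event, -d on a terminal miss, log(p) otherwise."""
    if tau and e:
        return 0.0       # event found at termination
    if tau:
        return -d        # no event: penalize by the miss distance
    return math.log(p)   # non-terminal step: log-likelihood of the transition
```

For instance, a terminal state without an event at distance 3 yields a reward of -3, while a non-terminal step with probability 0.5 yields log(0.5).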

References

  1. Robert J. Moss, Ritchie Lee, Nicholas Visser, Joachim Hochwarth, James G. Lopez, and Mykel J. Kochenderfer, "Adaptive Stress Testing of Trajectory Predictions in Flight Management Systems", Digital Avionics Systems Conference, 2020.
