Sample Factory version 2 introduces a new feature: you can wrap the RL algorithm with a custom `AlgoObserver`, which allows you to interact with the RL training process in an arbitrary way.

The `AlgoObserver` interface is defined as follows:
```python
class AlgoObserver:
    def on_init(self, runner: Runner) -> None:
        """Called after ctor, but before signal-slots are connected or any processes are started."""
        pass

    def on_connect_components(self, runner: Runner) -> None:
        """Connect additional signal-slot pairs in the observers if needed."""
        pass

    def on_start(self, runner: Runner) -> None:
        """Called right after sampling/learning processes are started."""
        pass

    def on_training_step(self, runner: Runner, training_iteration_since_resume: int) -> None:
        """Called after each training step."""
        pass

    def extra_summaries(self, runner: Runner, policy_id: PolicyID, env_steps: int, writer: SummaryWriter) -> None:
        pass

    def on_stop(self, runner: Runner) -> None:
        pass
```
Define your own class derived from `AlgoObserver` (e.g. `MyObserver`) and register it before starting the training process:
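Below is a minimal sketch of what such a training script could look like. It assumes the `make_runner()` helper and the `Runner.register_observer()` method from Sample Factory 2; `MyObserver` and `parse_args` are illustrative names, and the exact import paths may differ slightly between versions:

```python
from sample_factory.algo.runners.runner import AlgoObserver, Runner
from sample_factory.algo.utils.misc import ExperimentStatus
from sample_factory.train import make_runner


class MyObserver(AlgoObserver):
    def on_training_step(self, runner: Runner, training_iteration_since_resume: int) -> None:
        # Arbitrary custom logic executed after every training step.
        if training_iteration_since_resume % 100 == 0:
            print(f"Completed {training_iteration_since_resume} training iterations since resume")


def main():
    cfg = parse_args()  # hypothetical helper: parse the config as in any other Sample Factory training script

    # make_runner() creates the Runner without starting any processes,
    # so this is the right place to register custom observers.
    cfg, runner = make_runner(cfg)
    runner.register_observer(MyObserver())

    status = runner.init()
    if status == ExperimentStatus.SUCCESS:
        status = runner.run()
    return status


if __name__ == "__main__":
    main()
```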
Our DMLab integration provides an example of how to use `AlgoObserver` to implement custom summaries that aggregate information from multiple custom metrics (see the DMLab training example in the repository).
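The snippet below is a simplified sketch of that pattern (not the actual DMLab code): it assumes the environments report custom per-episode metrics that end up in the runner's windowed `policy_avg_stats`, and the observer averages them into a single extra summary. The metric names are illustrative, and the import paths and `policy_avg_stats` attribute are based on Sample Factory 2's Runner but may differ between versions:

```python
import numpy as np
from torch.utils.tensorboard import SummaryWriter

from sample_factory.algo.runners.runner import AlgoObserver, Runner
from sample_factory.utils.typing import PolicyID


class CustomMetricsObserver(AlgoObserver):
    # Illustrative names of per-episode metrics reported by the environments.
    metrics = ["level_1_score", "level_2_score"]

    def extra_summaries(self, runner: Runner, policy_id: PolicyID, env_steps: int, writer: SummaryWriter) -> None:
        # policy_avg_stats keeps a sliding window of recent values per metric and per policy
        # (assumption about the Runner internals; adjust to your version of Sample Factory).
        per_metric_averages = []
        for metric in self.metrics:
            samples = runner.policy_avg_stats.get(metric)
            if samples and len(samples[policy_id]) > 0:
                per_metric_averages.append(np.mean(samples[policy_id]))

        if per_metric_averages:
            # One aggregated summary combining all custom metrics for this policy.
            writer.add_scalar("custom/mean_score", float(np.mean(per_metric_averages)), env_steps)
```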
`AlgoObserver` is a new feature and further suggestions/extensions are welcome!