cyberneticlibrary

Train reinforcement learning models

stable-baselines3 skill setup L327,559
K-Dense-AI/scientific-agent-skills
What it does

Train RL agents with the PPO, SAC, DQN, or TD3 algorithms.

Best for

Quick prototyping of standard RL problems with a scikit-learn-like API and proven algorithm implementations.

Inputs
  • Gymnasium environment
  • algorithm choice
  • hyperparameters
Outputs
  • trained model
  • learning curves
  • policy
Requires
  • stable-baselines3 2.8+
  • PyTorch
  • Gymnasium
  • TensorBoard (optional)
Preconditions

Python 3.10+; a Gymnasium environment with defined action and observation spaces; a compatible reward signal

Failure modes

Training can diverge on sparse rewards; results are sensitive to hyperparameters; memory use grows on large state spaces

Trust signals
  • stable-baselines3 2.8 (April 2026)
  • PyTorch backend
  • algorithm-specific explainers