Train reinforcement learning models
stable-baselines3 · skill · setup L3 · ★27,559
K-Dense-AI/scientific-agent-skills ↗
What it does
Train RL agents with the PPO, SAC, DQN, or TD3 algorithms
Best for
Quick prototyping of standard RL problems with a scikit-learn-like API and proven algorithm implementations.
Inputs
- · Gymnasium environment
- · algorithm choice
- · hyperparameters
Outputs
- · trained model
- · learning curves
- · policy
Requires
- · stable-baselines3 2.8+
- · PyTorch
- · Gymnasium
- · TensorBoard (optional)
Preconditions
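Python 3.10+; a Gymnasium env whose action and observation spaces are supported by the chosen algorithm (e.g. DQN needs discrete actions; SAC and TD3 need continuous ones); a scalar reward signal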
Failure modes
Training divergence or stalled learning on sparse rewards; results sensitive to hyperparameters; high memory use with large observation spaces (e.g. replay buffers of off-policy algorithms)
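The memory failure mode can be estimated before training. The helper below is a back-of-the-envelope sketch, not part of the library: it assumes the replay buffer stores each observation and its successor separately and ignores the much smaller action, reward, and done arrays. The function name and the image shape are illustrative.

```python
import math


def replay_buffer_bytes(obs_shape, bytes_per_element, buffer_size, store_next_obs=True):
    """Rough observation-storage size of a replay buffer, in bytes."""
    per_obs = math.prod(obs_shape) * bytes_per_element
    copies = 2 if store_next_obs else 1  # obs and next_obs kept separately
    return buffer_size * per_obs * copies


# Assumption: stacked 4x84x84 uint8 frames and a buffer of one million transitions.
image_buffer = replay_buffer_bytes((4, 84, 84), 1, 1_000_000)
print(f"{image_buffer / 1e9:.1f} GB")  # prints "56.4 GB"

# A 4-dimensional float32 vector observation is tiny by comparison.
vector_buffer = replay_buffer_bytes((4,), 4, 1_000_000)
print(f"{vector_buffer / 1e6:.1f} MB")  # prints "32.0 MB"
```

When the estimate exceeds available RAM, lowering the `buffer_size` argument of the off-policy algorithms (DQN, SAC, TD3) is the most direct mitigation.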
Trust signals
- · stable-baselines3 2.8 (April 2026)
- · PyTorch backend
- · algorithm-specific explainers