Design Philosophy#
GenesisLab is built on several core design principles that guide its development.
1. Modularity Through Managers#
Principle: Separate concerns through specialized manager components.
Rationale:
Each manager handles one aspect of the learning problem
Managers can be developed and tested independently
Easy to swap implementations without affecting other components
Example:
# Each manager is independent
class LabScene:
    observation_manager: ObservationManager
    action_manager: ActionManager
    reward_manager: RewardManager
    # ... more managers
Benefits:
Clear code organization
Easy to understand and maintain
Facilitates team collaboration
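As a loose illustration of why this helps testing, here is a framework-free sketch (the real ObservationManager API may differ): a manager that only aggregates its terms can be exercised with stub terms, no simulator required.

```python
# Hypothetical minimal manager: aggregates independent "term" functions.
# Names and signatures are illustrative, not GenesisLab's actual API.
class ObservationManager:
    def __init__(self, terms):
        self.terms = terms  # list of callables, each returning a list of floats

    def compute(self, state):
        # Concatenate every term's output into one flat observation
        obs = []
        for term in self.terms:
            obs.extend(term(state))
        return obs

# Test the manager in isolation with stub terms -- no LabScene needed
manager = ObservationManager(terms=[
    lambda s: [s["height"]],
    lambda s: s["joint_pos"],
])
obs = manager.compute({"height": 0.3, "joint_pos": [0.1, -0.2]})
```

Because the manager depends only on its term functions, swapping an implementation is a matter of passing different callables.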
2. Composition Over Inheritance#
Principle: Build complex behaviors through term composition, not class hierarchies.
Rationale:
Deep inheritance hierarchies are hard to understand and modify
Composition allows mixing and matching behaviors
More flexible and maintainable
Example:
# Instead of: class MyReward(BaseReward)
# Use composition:
@configclass
class RewardManagerCfg:
    # Compose multiple reward terms
    forward_vel = RewardTermCfg(func=forward_vel_reward, weight=1.0)
    energy = RewardTermCfg(func=energy_penalty, weight=-0.001)
    smoothness = RewardTermCfg(func=smoothness_reward, weight=0.5)
Benefits:
Easy to add/remove terms
Clear configuration
Reusable term functions
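To make the composition concrete, here is a toy stand-in (plain dataclasses and floats instead of GenesisLab's configclass and tensors) showing how independently defined terms combine into one reward as a weighted sum:

```python
# Toy sketch of term composition; RewardTermCfg here is a stand-in for
# the real config class, not its actual definition.
from dataclasses import dataclass
from typing import Callable

@dataclass
class RewardTermCfg:
    func: Callable[[dict], float]
    weight: float

def compute_total_reward(terms: dict, state: dict) -> float:
    # The total reward is simply the weighted sum of all configured terms
    return sum(cfg.weight * cfg.func(state) for cfg in terms.values())

terms = {
    "forward_vel": RewardTermCfg(func=lambda s: s["vel_x"], weight=1.0),
    "energy": RewardTermCfg(func=lambda s: s["power"], weight=-0.001),
}
total = compute_total_reward(terms, {"vel_x": 1.5, "power": 100.0})
# 1.0 * 1.5 + (-0.001) * 100.0 = 1.4
```

Adding or removing a behavior is just adding or removing an entry in `terms`; no class hierarchy is touched.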
3. Configuration-Driven Design#
Principle: Define behavior through configuration, not code.
Rationale:
Experiments can be defined by changing configuration
No need to modify code for common variations
Easy to version and reproduce experiments
Example:
# Define a task entirely through configuration
@configclass
class Go2FlatTaskCfg:
    scene: LabSceneCfg = LabSceneCfg(num_envs=4096)
    robot: RobotCfg = Go2RobotCfg()
    terrain: TerrainCfg = FlatTerrainCfg()
    observations: ObservationManagerCfg = Go2ObservationsCfg()
    rewards: RewardManagerCfg = LocomotionRewardsCfg()
Benefits:
Reproducible experiments
Easy to share configurations
Version control friendly
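A small sketch of the workflow this enables, using plain dataclasses as stand-ins for the configclass-based configs: an experiment variant is a modified copy of a base configuration, with no code edits.

```python
# Stand-in configs for illustration; the real GenesisLab config classes differ.
from dataclasses import dataclass, field, replace

@dataclass
class SceneCfg:
    num_envs: int = 4096

@dataclass
class TaskCfg:
    scene: SceneCfg = field(default_factory=SceneCfg)
    reward_forward_weight: float = 1.0

base = TaskCfg()
# A new experiment is just a modified copy of the config -- easy to
# diff, version, and reproduce
ablation = replace(base, reward_forward_weight=0.5)
```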
4. Performance First#
Principle: Leverage parallelization and vectorization for maximum speed.
Rationale:
RL requires millions of samples
Fast simulation enables rapid iteration
Genesis provides massively parallel simulation across thousands of environments
Implementation:
# All operations are vectorized
num_envs = 4096

# Parallel simulation
obs = scene.step(actions)  # All envs step in parallel

# Batch computation
rewards = reward_manager.compute()  # Vectorized across envs
Benefits:
Train policies in minutes, not hours
More experiments in less time
Faster research iteration
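The vectorization idea can be sketched with NumPy standing in for torch tensors (assuming NumPy is available): per-environment quantities are batched arrays, and one expression computes the reward for every environment at once instead of looping in Python.

```python
import numpy as np

num_envs = 4096

# Batched per-environment state: one entry per environment
forward_vel = np.random.uniform(0.0, 2.0, size=num_envs)
power = np.random.uniform(0.0, 200.0, size=num_envs)

# A single vectorized expression replaces a Python loop over 4096 envs;
# the weights mirror the composed reward example above
rewards = forward_vel - 0.001 * power
```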
5. Sensible Defaults#
Principle: Provide good default configurations that work out-of-the-box.
Rationale:
Lower barrier to entry
Quick prototyping
Based on best practices from research
Example:
# Pre-configured tasks with sensible defaults
env = gym.make("GenesisLab-Go2-Flat-v0") # Just works!
# But still customizable
env = gym.make("GenesisLab-Go2-Flat-v0", num_envs=8192, cfg=custom_cfg)
Benefits:
Easy for beginners
Fast prototyping
Embeds best practices
6. Extensibility#
Principle: Easy to extend with custom components.
Rationale:
Every project has unique requirements
Framework shouldn’t constrain research
Extension should be straightforward
Extension Points:
# 1. Custom observation terms
def my_observation(scene: LabScene) -> torch.Tensor:
    return compute_custom_observation(scene)

# 2. Custom reward terms
def my_reward(scene: LabScene, params: dict) -> torch.Tensor:
    return compute_custom_reward(scene, params)

# 3. Custom managers
class MyCustomManager(ManagerBase):
    def compute(self) -> Any:
        return custom_computation()

# 4. Custom tasks
class MyTask(LocomotionTask):
    def setup_managers(self):
        # Configure custom managers
        pass
Benefits:
Supports novel research
Easy to experiment
Don’t need to fork the codebase
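A toy sketch of why extension needs no fork: because a custom term is just a function, "registering" it means referencing it from your own configuration alongside the built-in terms. The names here are illustrative, not GenesisLab's.

```python
# Hypothetical built-in term
def base_height(state: dict) -> list:
    return [state["height"]]

# Hypothetical user-defined term for novel research -- lives entirely in
# user code, no framework changes required
def my_custom_term(state: dict) -> list:
    return [state["height"] ** 2]

observation_terms = {
    "height": base_height,
    "height_sq": my_custom_term,  # drop-in extension
}

# The aggregation logic treats built-in and custom terms identically
obs = [v for term in observation_terms.values() for v in term({"height": 0.5})]
```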
7. Compatibility#
Principle: Work seamlessly with existing RL ecosystems.
Rationale:
Don’t reinvent the wheel
Leverage existing tools
Easy adoption
Implementation:
# Standard Gymnasium API
import gymnasium as gym
env = gym.make("GenesisLab-Go2-Flat-v0")
# Works with popular RL libraries
from stable_baselines3 import PPO
model = PPO("MlpPolicy", env)
# Compatible with RL frameworks
from rsl_rl.runners import OnPolicyRunner
runner = OnPolicyRunner(env, train_cfg, log_dir)
Benefits:
Use familiar tools
Access to RL library ecosystem
Easier for users coming from other frameworks
8. Clarity Over Cleverness#
Principle: Write clear, explicit code over clever abstractions.
Rationale:
Code is read more than written
Debugging is harder than writing
Explicit is better than implicit
Example:
# Clear and explicit
def compute_reward(self) -> torch.Tensor:
    forward_vel_rew = self.compute_forward_velocity_reward()
    energy_penalty = self.compute_energy_penalty()
    return forward_vel_rew - 0.001 * energy_penalty
# Avoid: Complex metaclasses, magic methods, hidden behavior
Benefits:
Easy to understand
Easy to debug
Maintainable code
9. Fail Fast and Loud#
Principle: Detect errors early and provide clear error messages.
Rationale:
Silent failures waste time
Clear errors guide users
Better development experience
Implementation:
# Validate configurations early
if num_envs <= 0:
    raise ValueError(f"num_envs must be positive, got {num_envs}")

# Provide helpful error messages
if robot_name not in SUPPORTED_ROBOTS:
    raise ValueError(
        f"Unsupported robot '{robot_name}'. "
        f"Supported robots: {list(SUPPORTED_ROBOTS.keys())}"
    )
Benefits:
Faster debugging
Better user experience
Fewer silent bugs
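Wrapping the checks above in a self-contained validation function (names here are assumed, not GenesisLab's actual API) shows how errors surface at construction time with an actionable message, rather than mid-training:

```python
# Hypothetical registry and validator for illustration
SUPPORTED_ROBOTS = {"go2": object(), "anymal": object()}

def validate_cfg(num_envs: int, robot_name: str) -> None:
    # Reject bad values before any expensive setup begins
    if num_envs <= 0:
        raise ValueError(f"num_envs must be positive, got {num_envs}")
    if robot_name not in SUPPORTED_ROBOTS:
        raise ValueError(
            f"Unsupported robot '{robot_name}'. "
            f"Supported robots: {list(SUPPORTED_ROBOTS)}"
        )

validate_cfg(4096, "go2")  # valid config passes silently

try:
    validate_cfg(0, "go2")
except ValueError as e:
    msg = str(e)  # error names the bad value explicitly
```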
10. Documentation as Code#
Principle: Documentation should be close to code and stay in sync.
Rationale:
Outdated docs are worse than no docs
Examples should be tested
API docs from docstrings
Implementation:
def forward_velocity_reward(
    scene: LabScene,
    target_velocity: float = 1.0,
) -> torch.Tensor:
    """
    Reward for moving forward at target velocity.

    Args:
        scene: The simulation scene
        target_velocity: Desired forward velocity in m/s

    Returns:
        Reward tensor of shape [num_envs]

    Example:
        >>> reward = forward_velocity_reward(scene, target_velocity=1.5)
    """
    ...
Benefits:
Always up-to-date
Examples are tested
Better IDE support
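The claim that "examples are tested" can be demonstrated with the standard-library doctest module. Below, a toy scalar stand-in for the reward term carries a docstring example, and the example is extracted and executed programmatically; a stale example would fail.

```python
import doctest

def forward_velocity_reward(vel: float, target_velocity: float = 1.0) -> float:
    """Toy scalar stand-in for the real reward term, with a tested example.

    Example:
        >>> forward_velocity_reward(1.5)
        -0.5
    """
    # Penalize deviation from the target forward velocity
    return -abs(vel - target_velocity)

# Extract the docstring examples and run them; failures are counted
runner = doctest.DocTestRunner()
tests = doctest.DocTestFinder().find(
    forward_velocity_reward,
    globs={"forward_velocity_reward": forward_velocity_reward},
)
for test in tests:
    runner.run(test)
```

In practice the same effect comes from running a test suite with doctest collection enabled, so the documented examples are verified on every commit.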
Design Trade-offs#
Performance vs. Flexibility#
Choice: Optimize for performance while maintaining key extension points.
Rationale: RL training is compute-bound, so performance is critical. However, we preserve flexibility where it matters (custom terms, managers).
Simplicity vs. Features#
Choice: Provide sensible defaults with optional advanced features.
Rationale: Easy for beginners, powerful for experts. Don’t force users to understand everything upfront.
Explicitness vs. Boilerplate#
Choice: Favor explicit configuration over magic.
Rationale: A bit more typing is worth the clarity. Configuration files are self-documenting.
Inspirations#
GenesisLab draws inspiration from several excellent projects:
Isaac Lab: Manager-based architecture, term composition
Legged Gym / RSL RL: Locomotion task structure, reward design
Gymnasium: Standard RL interface
Genesis: Physics simulation philosophy
Anti-Patterns to Avoid#
Based on our philosophy, we avoid:
❌ Deep inheritance hierarchies → Use composition
❌ Magic configuration behavior → Be explicit
❌ Hidden global state → Pass dependencies explicitly
❌ Clever abstractions → Keep it simple
❌ Premature optimization → Optimize where it matters (simulation loop)
❌ Tight coupling → Use managers for separation
❌ Silent failures → Fail fast with clear messages
Contributing#
When contributing to GenesisLab, please follow these principles:
Modularity: Does your change fit the manager-based architecture?
Configuration: Can it be configured without code changes?
Performance: Does it maintain vectorized operations?
Compatibility: Does it work with Gymnasium API?
Documentation: Are docstrings and examples included?
Clarity: Is the code clear and explicit?
Next Steps#
See architecture for system design
Read concepts for core abstractions
Check out tutorials for practical examples