Core Concepts#

Understanding these core concepts will help you work effectively with GenesisLab.

LabScene#

LabScene is the central simulation manager that orchestrates all components.

Key Characteristics:

  • Manages a Genesis scene with multiple parallel environments

  • Coordinates all managers (observation, action, reward, etc.)

  • Implements the MDP (Markov Decision Process) interface

  • Provides Gymnasium-compatible reset/step functions

Example:

from genesislab.engine import LabScene

scene = LabScene(cfg, num_envs=4096)
obs, info = scene.reset()
obs, rew, term, trunc, info = scene.step(actions)
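The reset/step calls above follow the Gymnasium convention: `reset()` returns `(obs, info)` and `step()` returns `(obs, reward, terminated, truncated, info)`, all batched over environments. As a minimal sketch of that contract (a hypothetical stand-in, not GenesisLab's actual implementation):

```python
# Hypothetical stub illustrating the Gymnasium-style interface LabScene
# exposes; the real class wraps a Genesis scene and its managers.
class StubScene:
    def __init__(self, num_envs: int):
        self.num_envs = num_envs
        self.t = 0

    def reset(self):
        self.t = 0
        obs = [0.0] * self.num_envs          # [num_envs, ...] in practice
        return obs, {}                       # (obs, info)

    def step(self, actions):
        self.t += 1
        obs = [float(self.t)] * self.num_envs
        rew = [1.0] * self.num_envs
        terminated = [False] * self.num_envs          # failure condition
        truncated = [self.t >= 3] * self.num_envs     # time-limit truncation
        return obs, rew, terminated, truncated, {}

scene = StubScene(num_envs=4)
obs, info = scene.reset()
while True:
    actions = [0.0] * scene.num_envs         # a real policy goes here
    obs, rew, term, trunc, info = scene.step(actions)
    if all(t or u for t, u in zip(term, trunc)):
        break
```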

Managers#

Managers are specialized components that handle specific aspects of the learning problem.

Manager Types#

ObservationManager#

Computes observation tensors from individual observation terms.

@configclass
class ObservationManagerCfg:
    base_lin_vel = ObservationTermCfg(func=obs_funcs.base_lin_vel)
    joint_pos = ObservationTermCfg(func=obs_funcs.joint_pos)

Key Features:

  • Policy observations (obs) and privileged observations (obs_critic)

  • Noise injection for robustness

  • Observation history and stacking
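One plausible form of the noise injection listed above is per-element uniform noise within a configured magnitude; the exact noise model GenesisLab uses may differ, so treat this as a sketch:

```python
import random

def add_uniform_noise(obs, magnitude, rng):
    # Per-element uniform noise in [-magnitude, +magnitude]; one plausible
    # form of the observation noise injection described above.
    return [o + rng.uniform(-magnitude, magnitude) for o in obs]

rng = random.Random(0)
clean = [0.1, -0.2, 0.3]
noisy = add_uniform_noise(clean, magnitude=0.05, rng=rng)
```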

RewardManager#

Computes total reward from weighted reward terms.

@configclass  
class RewardManagerCfg:
    forward_vel = RewardTermCfg(func=rew_funcs.forward_vel, weight=1.0)
    energy = RewardTermCfg(func=rew_funcs.energy_penalty, weight=-0.001)

Key Features:

  • Per-term weighting

  • Reward logging and analysis

  • Optional reward clipping
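The per-term weighting reduces to the weighted sum described in the MDP section, \(r_t = \sum_i w_i \cdot r_i\). A plain-Python sketch of that computation (term names here are illustrative):

```python
def total_reward(term_values, weights):
    # Weighted sum of per-term rewards: r_t = sum_i w_i * r_i
    return sum(weights[name] * value for name, value in term_values.items())

# Example term values for one step, weighted as in the config above.
terms = {"forward_vel": 0.8, "energy": 4.0}
weights = {"forward_vel": 1.0, "energy": -0.001}
r = total_reward(terms, weights)  # 0.8 * 1.0 + 4.0 * (-0.001) = 0.796
```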

ActionManager#

Processes actions from the policy and applies them to the robot.

@configclass
class ActionManagerCfg:
    joint_positions = ActionTermCfg(
        func=action_funcs.joint_position_action,
        scale=0.25,
        clip=(-1.0, 1.0)
    )

Key Features:

  • Action scaling and clipping

  • Multiple action types (position, velocity, torque)

  • Action history for observations
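A sketch of the scaling and clipping listed above. The clip-then-scale order and the offset around a default joint position are assumptions for illustration, not necessarily GenesisLab's exact pipeline:

```python
def process_action(raw, scale=0.25, clip=(-1.0, 1.0), default=0.0):
    # Clip the raw policy output to the configured range, then scale and
    # offset around a default joint position (order and offset assumed).
    lo, hi = clip
    clipped = max(lo, min(hi, raw))
    return default + scale * clipped

targets = [process_action(a) for a in (-2.0, 0.5, 2.0)]
```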

CommandManager#

Generates and manages command targets for the robot.

@configclass
class CommandManagerCfg:
    base_velocity = CommandTermCfg(
        func=cmd_funcs.uniform_velocity_command,
        ranges=VelocityRanges(lin_x=(0.0, 1.0), ang_z=(-1.0, 1.0))
    )

Key Features:

  • Command resampling on reset

  • Curriculum-based command ranges

  • Multiple command types
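Command resampling on reset can be sketched as drawing uniformly within each configured range (mirroring the `VelocityRanges` above; the real sampler may be vectorized over environments):

```python
import random

def sample_velocity_command(ranges, rng):
    # Resample a command uniformly within each configured range,
    # as is done when an environment resets.
    return {axis: rng.uniform(lo, hi) for axis, (lo, hi) in ranges.items()}

rng = random.Random(42)
ranges = {"lin_x": (0.0, 1.0), "ang_z": (-1.0, 1.0)}
cmd = sample_velocity_command(ranges, rng)
```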

TerminationManager#

Checks conditions for episode termination.

@configclass
class TerminationManagerCfg:
    timeout = TimeoutTermCfg(max_time=20.0)
    base_height = TerminationTermCfg(
        func=term_funcs.base_height_below,
        params={"threshold": 0.25}
    )

Key Features:

  • Distinction between termination and truncation

  • Per-environment termination tracking

  • Automatic episode statistics
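The termination/truncation distinction above can be sketched for a single environment as follows (treating truncation as the no-failure time-limit case is an assumption; the config values mirror the example above):

```python
def episode_done(sim_time, base_height, max_time=20.0, height_threshold=0.25):
    # Termination: failure condition (base dropped below the threshold).
    # Truncation: time limit reached without a failure.
    terminated = base_height < height_threshold
    truncated = (not terminated) and sim_time >= max_time
    return terminated, truncated
```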

CurriculumManager#

Implements progressive training curricula.

@configclass
class CurriculumManagerCfg:
    terrain_difficulty = CurriculumTermCfg(
        func=curriculum_funcs.terrain_levels,
        start_level=0,
        end_level=10,
        num_steps=10000
    )
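One way the `start_level`/`end_level`/`num_steps` parameters above could map a training step to a difficulty level is a clamped linear schedule; the linear shape is an assumption for illustration:

```python
def curriculum_level(step, start_level=0, end_level=10, num_steps=10000):
    # Clamped linear interpolation from start_level to end_level over
    # num_steps training steps (schedule shape assumed for illustration).
    frac = min(max(step / num_steps, 0.0), 1.0)
    return int(round(start_level + frac * (end_level - start_level)))
```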

EventManager#

Handles domain randomization and episodic events.

@configclass
class EventManagerCfg:
    reset_base = EventTermCfg(
        func=event_funcs.reset_root_state,
        mode="reset"
    )
    
    randomize_mass = EventTermCfg(
        func=event_funcs.randomize_rigid_body_mass,
        mode="interval",
        interval_steps=250,
        params={"mass_range": (0.8, 1.2)}
    )
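The `mode="interval"` event above fires periodically rather than on reset. A sketch of that pattern for the mass randomization term (firing on exact multiples of `interval_steps` is an assumption):

```python
import random

def maybe_randomize_mass(step, base_mass, rng, interval_steps=250,
                         mass_range=(0.8, 1.2)):
    # Interval-mode event: fires every interval_steps sim steps and
    # rescales the nominal mass by a uniform factor from mass_range.
    if step % interval_steps != 0:
        return base_mass
    lo, hi = mass_range
    return base_mass * rng.uniform(lo, hi)

rng = random.Random(0)
m_off = maybe_randomize_mass(1, 12.0, rng)    # off-interval: unchanged
m_on = maybe_randomize_mass(250, 12.0, rng)   # on-interval: rescaled
```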

Terms#

Terms are the basic computation units within managers.

Term Structure#

Each term consists of:

  1. Function: The actual computation

  2. Configuration: Parameters for the function

  3. Metadata: Weight, noise, etc.

Example: Observation Term#

def joint_pos(scene: LabScene) -> torch.Tensor:
    """Get joint positions of the robot."""
    return scene.robot.get_joint_positions()

@configclass
class ObservationTermCfg:
    func: Callable = joint_pos
    noise: NoiseCfg | None = None
    scale: float = 1.0

Example: Reward Term#

def forward_velocity_reward(
    scene: LabScene,
    target_velocity: float = 1.0
) -> torch.Tensor:
    """Reward for moving forward at target velocity."""
    lin_vel = scene.robot.get_base_linear_velocity()
    return torch.exp(-torch.abs(lin_vel[:, 0] - target_velocity))

@configclass
class RewardTermCfg:
    func: Callable = forward_velocity_reward
    weight: float = 1.0
    params: dict = field(default_factory=lambda: {"target_velocity": 1.0})

ConfigClass#

GenesisLab uses @configclass for all configurations.

Critical Rule#

ALWAYS use @configclass, NEVER use @dataclass.

Basic Usage#

from genesislab.utils.configclass import configclass

@configclass
class RobotCfg:
    name: str = "go2"
    num_dof: int = 12
    motor_strength: float = 1.0

Nested Configurations#

@configclass
class TaskCfg:
    robot: RobotCfg = RobotCfg()
    scene: LabSceneCfg = LabSceneCfg(num_envs=4096)
    
    observations: ObservationManagerCfg = ObservationManagerCfg(
        base_lin_vel=ObservationTermCfg(...),
        joint_pos=ObservationTermCfg(...)
    )

Scene Builder/Controller/Querier#

These are three patterns for interacting with the Genesis scene.

SceneBuilder#

Purpose: Construct the initial scene during setup.

class MySceneBuilder:
    def build(self, scene: gs.Scene):
        # Add terrain
        self.build_terrain(scene)
        
        # Add robot
        self.build_robot(scene)
        
        # Add sensors
        self.build_sensors(scene)

Used in: LabScene.setup_scene()

SceneController#

Purpose: Modify scene state during simulation.

class MySceneController:
    def apply_actions(self, scene: gs.Scene, actions: torch.Tensor):
        # Apply motor commands
        scene.robot.set_dofs_position_target(actions)
    
    def reset_robot(self, scene: gs.Scene, env_ids: torch.Tensor):
        # Reset robot state for specified environments
        scene.robot.set_dofs_position(default_pos, env_ids)

Used in: LabScene.step(), LabScene.reset()

SceneQuerier#

Purpose: Read data from the scene without modifying it.

class MySceneQuerier:
    def get_robot_state(self, scene: gs.Scene) -> dict:
        return {
            "joint_pos": scene.robot.get_dofs_position(),
            "joint_vel": scene.robot.get_dofs_velocity(),
            "base_pos": scene.robot.get_pos(),
        }

Used in: Observation terms, reward terms

Sensors#

GenesisLab supports two types of sensors:

Fake Sensors#

Fast, simple sensors implemented in PyTorch:

  • Compute measurements from robot state

  • No rendering or ray-tracing

  • Examples: IMU, joint encoders

@configclass
class FakeIMUCfg(SensorBaseCfg):
    sensor_type: str = "fake"
    update_period: float = 0.01  # 100 Hz

Genesis Sensors#

Native Genesis sensors with high fidelity:

  • Use Genesis rendering/ray-tracing

  • More realistic but slower

  • Examples: cameras, LiDAR

@configclass
class CameraSensorCfg(SensorBaseCfg):
    sensor_type: str = "genesis"
    width: int = 640
    height: int = 480
    fov: float = 90.0

Terrains#

GenesisLab supports various terrain types:

Flat Terrain#

Simple flat ground for basic training.

Rough Terrain#

Procedurally generated height maps with:

  • Random noise

  • Stepping stones

  • Stairs

  • Slopes

Mesh Terrain#

Custom mesh terrains loaded from files.

Curriculum#

Terrains can be arranged in difficulty levels:

@configclass
class TerrainCfg:
    curriculum: bool = True
    num_levels: int = 10
    terrain_types: list[str] = field(
        default_factory=lambda: ["flat", "rough", "stairs", "slopes"]
    )

Vectorization#

All operations in GenesisLab are vectorized across environments:

# Data shape: [num_envs, ...]
num_envs = 4096

# Actions: [num_envs, num_actions]
actions = policy(obs)  # [4096, 12]

# Observations: [num_envs, num_obs]
obs = scene.get_observations()  # [4096, 48]

# Rewards: [num_envs]
rewards = scene.get_rewards()  # [4096]

# Terminations: [num_envs]
terminated = scene.get_terminations()  # [4096]
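A common consequence of this layout is that per-environment operations become boolean-mask indexing rather than Python loops. For example, resetting only the terminated environments' joint positions (shapes as above, shrunk for illustration):

```python
import torch

num_envs = 4
joint_pos = torch.zeros(num_envs, 12)            # [num_envs, num_dof]
default_pos = torch.full((12,), 0.5)             # nominal joint positions

# Suppose envs 1 and 3 terminated this step; reset only those rows.
terminated = torch.tensor([False, True, False, True])
env_ids = terminated.nonzero(as_tuple=True)[0]   # tensor([1, 3])
joint_pos[env_ids] = default_pos                 # broadcast over masked rows
```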

MDP (Markov Decision Process)#

GenesisLab implements the standard RL MDP interface:

State Space#

  • Observation: \(o_t \in \mathcal{O}\)

  • Privileged Info: Additional information for the critic (asymmetric actor-critic)

Action Space#

  • Continuous: Joint position/velocity/torque targets

  • Discrete: (future support)

Reward Function#

  • \(r_t = \sum_i w_i \cdot r_i(s_t, a_t, s_{t+1})\)

  • Sum of weighted reward terms

Transition Dynamics#

  • Defined by Genesis physics simulation

  • Deterministic given state and action

  • Randomness from domain randomization events

Termination Conditions#

  • Terminated: Episode ends due to failure (e.g., robot falls)

  • Truncated: Episode ends due to time limit

Next Steps#