Data Collection

This page covers the complete recording workflow for the LinkerBot O6: leader-follower teleoperation, the LeRobot dataset format, an episode quality checklist, and links to the broader data collection pipeline.

Recording Workflow for the O6

The complete process, from hardware-ready to your first recorded episode. Follow the steps in order.

1

Verify hardware & CAN interface

Confirm the O6 is mounted, powered on, and the CAN interface is up. Run candump can0 and verify motor heartbeat packets appear before continuing.
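If you capture a few seconds of candump can0 output to a file, a short script can confirm that every motor you expect is actually transmitting. The sketch below is a minimal illustration, assuming the default candump line layout (interface, hexadecimal ID, bracketed length, data bytes). The CAN IDs in the sample are hypothetical placeholders, not the O6's real heartbeat IDs, so substitute the values from your hardware documentation.

```python
import re
from collections import Counter

# Matches the default candump line layout, for example:
#   can0  181   [8]  00 01 02 03 04 05 06 07
CANDUMP_LINE = re.compile(r"^\s*(\S+)\s+([0-9A-Fa-f]+)\s+\[(\d+)\]")


def count_frames_by_id(lines):
    """Count frames per CAN ID in captured candump output."""
    counts = Counter()
    for line in lines:
        match = CANDUMP_LINE.match(line)
        if match:
            counts[int(match.group(2), 16)] += 1
    return counts


def missing_ids(lines, expected_ids):
    """Return the expected CAN IDs that never appeared in the capture."""
    seen = count_frames_by_id(lines)
    return sorted(set(expected_ids) - set(seen))


# Hypothetical IDs for illustration only; replace with the O6's real ones.
sample = [
    "  can0  181   [8]  00 01 02 03 04 05 06 07",
    "  can0  182   [8]  00 01 02 03 04 05 06 07",
    "  can0  181   [8]  00 01 02 03 04 05 06 08",
]
silent = missing_ids(sample, [0x181, 0x182, 0x183])
print([hex(can_id) for can_id in silent])
```

An empty result means every expected ID was seen at least once in the capture; any ID that is listed belongs to a motor worth checking before you continue.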

2

Configure task and camera layout

Define the task description, set up the camera(s) at the correct angles, and place task objects in the workspace. A consistent scene setup across episodes is critical for policy generalization.

3

Start the recording session

Launch the LeRobot control script in record mode. This arms the system for episode recording; the session waits for your start trigger before it begins capturing.

python -m lerobot.scripts.control_robot \
  --robot.type=linkerbot_o6 \
  --control.type=record \
  --control.fps=30 \
  --control.repo_id=your-username/o6-task-name \
  --control.num_episodes=50 \
  --control.single_task="Pick up the blue block"

4

Teleoperate and record episodes

Use a leader arm (or your keyboard for simple tests) to teleoperate the O6. Press the start/stop key to bracket each episode. Reset the scene between episodes for consistency.

5

Review and filter episodes

Use the LeRobot replay tool to review each episode visually. Discard any that fail the quality checklist below. Quality over quantity: 30 excellent episodes beat 100 mediocre ones.

python -m lerobot.scripts.control_robot \
  --robot.type=linkerbot_o6 \
  --control.type=replay \
  --control.repo_id=your-username/o6-task-name \
  --control.episode=0
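The replay command above takes one episode index at a time, so reviewing a full session means repeating it with a different --control.episode value. A minimal sketch of generating those invocations follows. It reuses only the flags shown in the command above; the helper name replay_command is illustrative and not part of LeRobot.

```python
import shlex


def replay_command(repo_id, episode):
    """Build the replay invocation from this step for one episode index."""
    return [
        "python", "-m", "lerobot.scripts.control_robot",
        "--robot.type=linkerbot_o6",
        "--control.type=replay",
        f"--control.repo_id={repo_id}",
        f"--control.episode={episode}",
    ]


# Print the commands for the first three episodes so they can be
# inspected or pasted into a shell one at a time.
commands = [replay_command("your-username/o6-task-name", i) for i in range(3)]
for command in commands:
    print(shlex.join(command))
```

To run a replay directly from Python instead of printing it, pass one of these lists to subprocess.run(command, check=True). Replay moves the physical arm, so stepping through episodes one at a time is usually safer than looping unattended.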

6

Upload to HuggingFace Hub

Push your filtered dataset to the HuggingFace Hub for sharing and training. The dataset is immediately available for policy training in LeRobot.

huggingface-cli login
python -m lerobot.scripts.push_dataset_to_hub \
  --repo_id=your-username/o6-task-name

LeRobot Dataset Format for O6

Each recorded episode is stored in the standard LeRobot HuggingFace dataset format. This format is directly compatible with ACT, Diffusion Policy, and all other LeRobot-supported training algorithms.

Episode structure

dataset/
  data/
    episode_000000/
      observation.state.npy     # [T, 12] — 6 joint positions + 6 velocities
      action.npy                # [T, 6]  — 6 target joint positions
      observation.images.wrist_cam/
        frame_000000.png        # 640x480 @ 30 fps
        ...
      observation.images.overhead_cam/
        frame_000000.png
        ...
      episode.json              # {task, success, duration_s, num_frames}
  meta_data/
    info.json                   # dataset schema version, robot type, fps
    stats.json                  # per-channel mean, std, min, max
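Before reviewing episodes by eye, it can help to confirm that each episode directory matches the layout above. The following sketch checks for the listed files and camera folders using only the standard library. It assumes the two camera names shown in the tree, and it assumes that num_frames in episode.json should equal the number of PNG frames per camera; neither assumption is confirmed by the format description, so adjust them to your setup.

```python
import json
import tempfile
from pathlib import Path

REQUIRED_FILES = ("observation.state.npy", "action.npy", "episode.json")
# Camera names taken from the tree above; change them if your rig differs.
CAMERA_DIRS = ("observation.images.wrist_cam", "observation.images.overhead_cam")


def check_episode_dir(episode_dir):
    """Return a list of human-readable problems found in one episode directory."""
    episode_dir = Path(episode_dir)
    problems = []
    for name in REQUIRED_FILES:
        if not (episode_dir / name).is_file():
            problems.append(f"missing file: {name}")

    frame_counts = {}
    for cam in CAMERA_DIRS:
        cam_dir = episode_dir / cam
        if not cam_dir.is_dir():
            problems.append(f"missing camera directory: {cam}")
            continue
        frame_counts[cam] = len(list(cam_dir.glob("frame_*.png")))

    meta_path = episode_dir / "episode.json"
    if meta_path.is_file():
        expected = json.loads(meta_path.read_text()).get("num_frames")
        # Assumption: num_frames counts the frames stored per camera.
        for cam, count in frame_counts.items():
            if count != expected:
                problems.append(f"{cam}: {count} frames, episode.json says {expected}")
    return problems


# Demo on a deliberately incomplete episode that contains only episode.json.
with tempfile.TemporaryDirectory() as tmp:
    episode = Path(tmp) / "episode_000000"
    episode.mkdir()
    (episode / "episode.json").write_text(json.dumps({"num_frames": 2}))
    demo_problems = check_episode_dir(episode)

for problem in demo_problems:
    print(problem)
```

An empty list means the directory has every expected file and a frame count consistent with its metadata under the stated assumptions.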

State and action dimensions

# observation.state: [T, 12]
# Columns: [j0_pos, j1_pos, j2_pos, j3_pos, j4_pos, j5_pos,
#            j0_vel, j1_vel, j2_vel, j3_vel, j4_vel, j5_vel]
# Units: radians and radians/second

# action: [T, 6]
# Columns: [j0_target, j1_target, j2_target, j3_target, j4_target, j5_target]
# Units: radians
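
As a minimal sketch of how this column layout can be used, the helper below splits one 12-value state row into its position and velocity halves and rejects rows of the wrong width. The function and constant names are illustrative, and the sample values are made up.

```python
NUM_JOINTS = 6

# Column names in the order documented above.
STATE_COLUMNS = [f"j{i}_pos" for i in range(NUM_JOINTS)] + [
    f"j{i}_vel" for i in range(NUM_JOINTS)
]
ACTION_COLUMNS = [f"j{i}_target" for i in range(NUM_JOINTS)]


def split_state(row):
    """Split one observation.state row into (positions_rad, velocities_rad_s)."""
    if len(row) != 2 * NUM_JOINTS:
        raise ValueError(f"expected {2 * NUM_JOINTS} values, got {len(row)}")
    return list(row[:NUM_JOINTS]), list(row[NUM_JOINTS:])


def label_action(row):
    """Attach the documented column names to one action row."""
    if len(row) != NUM_JOINTS:
        raise ValueError(f"expected {NUM_JOINTS} values, got {len(row)}")
    return dict(zip(ACTION_COLUMNS, row))


# Made-up sample row: six positions (rad) followed by six velocities (rad/s).
state_row = [0.0, 0.5, -0.5, 1.0, 0.0, 0.25, 0.0, 0.1, -0.1, 0.0, 0.0, 0.05]
positions, velocities = split_state(state_row)
print(positions)
print(velocities)
print(label_action(positions))
```

Checking row width at load time catches a state array that was saved with a different joint count before it reaches training.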

Train a policy from your O6 dataset

python -m lerobot.scripts.train \
  --dataset_repo_id=your-username/o6-task-name \
  --policy.type=act \
  --output_dir=./checkpoints/o6-act-v1 \
  --training.num_epochs=100

Episode Quality Checklist

Apply this checklist to every episode before including it in your training dataset. Bad data is worse than less data.

  • Task completed successfully — the arm reached the goal state without human intervention. No partial completions.
  • Motion is smooth and deliberate — no jerky corrections, overshoots, or sudden direction changes. Smooth demonstrations train smoother policies.
  • All camera frames present — no dropped frames, no occlusions of the task-relevant workspace region.
  • Joint states are continuous — no timestep gaps greater than 40 ms in the state log.
  • Episode duration is consistent — episodes shorter than 3 s or longer than 30 s are usually outliers. Review them before including.
  • Scene was reset identically — task objects were returned to the same starting position before the episode began.
  • No CAN errors during recording — check candump can0 logs for error frames during the session.
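
Two of the checks above, the 40 ms gap limit and the 3 s to 30 s duration range, are easy to automate. The sketch below assumes you can obtain per-frame timestamps in seconds for an episode; where those timestamps are stored is not specified here, so the input list is a stand-in. The remaining checklist items still need visual review.

```python
MAX_GAP_S = 0.040       # checklist: no gaps greater than 40 ms
MIN_DURATION_S = 3.0    # checklist: shorter episodes are usually outliers
MAX_DURATION_S = 30.0   # checklist: longer episodes are usually outliers


def check_timing(timestamps):
    """Return timing issues for one episode given ascending timestamps in seconds."""
    if len(timestamps) < 2:
        return ["fewer than two timestamps"]

    issues = []
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    worst_gap = max(gaps)
    if worst_gap > MAX_GAP_S:
        issues.append(f"largest gap {worst_gap * 1000:.1f} ms exceeds 40 ms")

    duration = timestamps[-1] - timestamps[0]
    if not MIN_DURATION_S <= duration <= MAX_DURATION_S:
        issues.append(f"duration {duration:.1f} s is outside 3-30 s, review manually")
    return issues


# Synthetic 30 fps episode of about 5 s, then the same episode with one frame dropped.
clean = [i / 30 for i in range(150)]
dropped = clean[:60] + clean[61:]

print(check_timing(clean))
print(check_timing(dropped))
```

At 30 fps the nominal frame period is about 33.3 ms, so a single dropped frame produces a gap of roughly 66.7 ms and fails the 40 ms check.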

Data Collection Pipeline Overview →

Ready to Train?

Follow the LinkerBot O6 learning path for the complete setup-to-training workflow.