
Cameras

The agent discovers your robot's cameras and streams them for live teleoperation.

Discovery and selection

The agent uses V4L2 (the standard Linux video interface) to:

  • Discover connected cameras.
  • Enumerate capabilities for each device: supported pixel formats and resolutions.
  • Auto-pick sensible cameras using a built-in heuristic when you have not pinned specific devices.
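
The agent's actual heuristic and priority order are not documented on this page, so the following is only an illustrative sketch of how an auto-pick could rank devices from their enumerated capabilities. The `CameraInfo` type, the `FORMAT_PREFERENCE` order, the two-camera limit, and the scoring rule are all assumptions made for the example, not the agent's real behavior.

```python
from dataclasses import dataclass, field

# Hypothetical preference order (V4L2 fourcc codes); the agent's real
# priority list is not documented here.
FORMAT_PREFERENCE = ["H264", "MJPG", "YUYV", "NV12"]


@dataclass
class CameraInfo:
    """Capabilities enumerated for one V4L2 device (hypothetical structure)."""
    path: str                                    # e.g. "/dev/video0"
    formats: dict = field(default_factory=dict)  # fourcc -> list of (width, height)


def score(cam: CameraInfo) -> tuple:
    """Lower is better: preferred pixel format first, then larger max resolution."""
    best_rank = len(FORMAT_PREFERENCE)  # sentinel meaning "no usable format"
    best_pixels = 0
    for fourcc, sizes in cam.formats.items():
        if fourcc not in FORMAT_PREFERENCE or not sizes:
            continue
        rank = FORMAT_PREFERENCE.index(fourcc)
        pixels = max(w * h for w, h in sizes)
        if (rank, -pixels) < (best_rank, -best_pixels):
            best_rank, best_pixels = rank, pixels
    return (best_rank, -best_pixels, cam.path)


def auto_pick(cameras, pinned=None, limit=2):
    """Return pinned devices if any are configured, else the top-scoring cameras."""
    if pinned:
        by_path = {c.path: c for c in cameras}
        return [by_path[p] for p in pinned if p in by_path]
    usable = [c for c in cameras if score(c)[0] < len(FORMAT_PREFERENCE)]
    return sorted(usable, key=score)[:limit]


cams = [
    CameraInfo("/dev/video0", {"YUYV": [(640, 480)]}),
    CameraInfo("/dev/video2", {"MJPG": [(1280, 720)], "YUYV": [(640, 480)]}),
    CameraInfo("/dev/video4", {}),  # node that reports no capture formats
]
picked = auto_pick(cams)
```

In this sketch, a device that reports no recognized format is skipped entirely, and pinned devices bypass the ranking, which mirrors the "auto-pick only when nothing is pinned" behavior described above.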

To override the automatic choice, configure specific cameras in robot.toml. See Configuration and Configure a camera. A detailed reference for the camera field names and the auto-pick priority order is coming soon.

Supported sources

The pipeline accepts the common V4L2 source formats:

  • Raw (e.g. YUY2, I420, NV12)
  • MJPEG
  • H.264

Each source is normalized into a common format before encoding, so cameras with different native formats can be used together.
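
As an illustration of what this normalization can look like, the sketch below maps each source kind to a GStreamer decode/convert fragment that always ends in the same raw format. `jpegdec`, `h264parse`, `avdec_h264`, and `videoconvert` are standard GStreamer elements, but which elements the agent actually uses is an assumption, as are the function name `normalize_chain` and the I420 target format.

```python
def normalize_chain(source_kind: str, raw_format: str = "YUY2") -> str:
    """Return a gst-launch style fragment that converts a V4L2 source to raw I420.

    Hypothetical helper: the element choice and the I420 target are assumptions.
    """
    if source_kind == "raw":
        # Raw frames only need a caps filter naming the camera's pixel format.
        front = f"video/x-raw,format={raw_format}"
    elif source_kind == "mjpeg":
        front = "image/jpeg ! jpegdec"
    elif source_kind == "h264":
        front = "video/x-h264 ! h264parse ! avdec_h264"
    else:
        raise ValueError(f"unsupported source kind: {source_kind!r}")
    # Every branch ends in the same raw format, so the encoder sees uniform input.
    return f"{front} ! videoconvert ! video/x-raw,format=I420"


for kind in ("raw", "mjpeg", "h264"):
    print(kind, "->", normalize_chain(kind))
```

Because every branch produces the same output caps, the stages after this point do not need to know which kind of camera is attached.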

The video pipeline

For teleop, frames flow through GStreamer and are encoded to VP8, then delivered to the operator over WebRTC:

V4L2 camera  ->  GStreamer (decode/convert)  ->  VP8 encode  ->  WebRTC track

The encoder is tuned for low latency rather than maximum quality, which keeps teleoperation responsive.
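
To make the latency trade-off concrete, here is a minimal sketch that assembles a pipeline description with a low-latency `vp8enc` configuration. The properties used (`deadline`, `cpu-used`, `lag-in-frames`, `end-usage`, `target-bitrate`, `keyframe-max-dist`) are real `vp8enc` properties, but the specific values, the `teleop_pipeline` helper, and the `appsink name=vp8_out` hand-off to WebRTC are assumptions; the agent's actual settings are not documented here.

```python
def teleop_pipeline(device: str, bitrate_kbps: int = 1500, fps: int = 30) -> str:
    """Build a gst-launch style description: V4L2 -> convert -> VP8 -> sink.

    All tuning values below are illustrative assumptions.
    """
    encoder = " ".join([
        "vp8enc",
        "deadline=1",        # realtime mode: spend as little time per frame as possible
        "cpu-used=8",        # favor encoding speed over compression quality
        "lag-in-frames=0",   # no lookahead buffering, which would add delay
        "end-usage=cbr",     # constant bitrate keeps network load predictable
        f"target-bitrate={bitrate_kbps * 1000}",  # vp8enc expects bits per second
        f"keyframe-max-dist={fps * 2}",           # a keyframe at least every ~2 seconds
    ])
    stages = [
        f"v4l2src device={device}",
        "videoconvert",
        # Drop stale frames instead of queueing them when the encoder falls behind.
        "queue leaky=downstream max-size-buffers=1",
        encoder,
        "appsink name=vp8_out",  # hypothetical hand-off point to the WebRTC track
    ]
    return " ! ".join(stages)


print(teleop_pipeline("/dev/video0"))
```

A description like this could be handed to GStreamer's `Gst.parse_launch`; the leaky single-buffer queue reflects the same priority as the encoder settings, which is to show the newest frame rather than every frame.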

Dual-track and live switching

  • The agent supports two simultaneous video tracks, so an operator can view two cameras at once.
  • Camera selection is live-switchable: the operator can change which camera feeds a track during a session without tearing down the connection. Each track slot switches make-before-break, meaning the new camera feed is brought up before the old one is released.
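
The make-before-break ordering can be sketched with a small simulation. `TrackSlot` and `FakeSource` are hypothetical names used only for this example; they stand in for a WebRTC track slot and a camera pipeline and do not reflect the agent's internal types.

```python
class FakeSource:
    """Stand-in for a camera pipeline; records lifecycle events in a shared log."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def start(self):
        self.log.append(f"start {self.name}")

    def stop(self):
        self.log.append(f"stop {self.name}")


class TrackSlot:
    """One outgoing video track whose feeding camera can be swapped live."""

    def __init__(self):
        self.active = None

    def switch_to(self, new_source):
        # Make: bring the new camera up while the old one is still feeding the track.
        new_source.start()
        old, self.active = self.active, new_source
        # Break: release the old camera only after the new one has taken over.
        if old is not None:
            old.stop()


log = []
slot = TrackSlot()
slot.switch_to(FakeSource("front", log))
slot.switch_to(FakeSource("rear", log))
print(log)  # ['start front', 'start rear', 'stop front']
```

One consequence of this ordering is that if starting the new source raises an error, the slot keeps its current camera, so a failed switch does not leave the operator without video.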

Teleop sessions start and stop dynamically as operators connect and disconnect.