[UR5e] direct_torque — correct control equation, gains, and max torque for VLA / RL inference (not trajectory following)

Setup:

UR5e | PolyScope 5.25.2 | ur_rtde 1.6.3 | RTX 5070 Ti | Robotiq Hand-E gripper 0.990 kg
References read: UR official direct_torque circular trajectory example, NVIDIA IndustReal (USB/gear insertion), ur_rtde API docs. None of these cover the discrete-step inference use case — hence this post.


What I am trying to do:

Deploy a Vision-Language-Action (VLA) policy on the UR5e using direct_torque instead of servoJ. The VLA sends discrete joint-position targets at 50 Hz — not a smooth trajectory, just position steps from learned inference. The torque controller runs at 500 Hz and must track those steps while staying compliant during contact so we do not get protective stops. I read the official documentation but it is written for trajectory following. I cannot find guidance for this discrete-step VLA or RL inference use case and am looking for the correct equation, gains, and limits.


1. Python ur_rtde vs URScript — which is the right path?

In ur_rtde Python, only the V1 friction API is available — friction_comp=True, which is binary with no per-joint tuning. getMassMatrix() from Python costs 7–15 ms round-trip, making real-time dynamics impossible inside a 2 ms loop. GC pauses also make 500 Hz unreliable.

In URScript, the V2 friction API is available — viscous_scale and coulomb_scale per joint, which gives much finer control. The RT loop is reliable at 500 Hz.

Is URScript the only correct path for direct_torque with proper friction tuning? If the Python path is viable, what is the right way to use it? And with V1 friction (binary, Python) vs V2 friction (per-joint, URScript), how different should the gains be?


2. What is the correct control equation for VLA / RL inference?

For trajectory following, a reference velocity is well defined. For discrete position steps from a VLA or RL policy, there is no meaningful reference velocity. We tested two damping options:

Option A: Kd times (v_reference minus qd) — reference velocity from finite-differencing targets
Option B: Kd times (0 minus qd) — brake against current velocity, no reference

Option B behaves much better for step inputs. Is this the correct formulation for this use case? What equation do people actually use for direct_torque outside of trajectory following?

Also: should Coriolis be included in the command? And does get_coriolis_and_centrifugal_torques() return a 6-element list or a 6x6 matrix? We have seen conflicting behavior when indexing it.


3. Is the mass matrix necessary, and if so, full 6x6 or diagonal only?

Is get_mass_matrix() needed for stable tracking at slow manipulation speeds, or does a well-tuned PD with V2 friction compensation give equivalent results? If the mass matrix is needed, is the diagonal approximation sufficient for the UR5e at speeds under 0.5 rad/s, or do the off-diagonal coupling terms matter enough to include?


4. What are reasonable starting-point gains for the UR5e?

With V2 friction active (viscous_scale around 0.9, coulomb_scale around 0.8), what Kp and Kd range is appropriate for direct_torque? Should gains be uniform across all joints or scaled per joint inertia? We found that scaling gains by inertia causes wrist joints to track very poorly — is uniform gain the correct approach for this robot?

With V1 friction (Python path, binary), would the gain range be significantly different?


5. What are the correct max torque limits for direct_torque on the UR5e?

We clamp to 150 Nm for joints 0–2 and 28 Nm for joints 3–5 based on the UR5e datasheet. Is this correct for direct_torque, or does the API apply its own internal limits that we should know about?


6. Does direct_torque compensate gravity internally?

Should gravity torques be added to the command, or does direct_torque handle gravity compensation automatically? Asking to confirm and avoid double-compensation.


Any guidance from UR engineers or anyone who has used direct_torque for compliant manipulation or learned policy deployment is appreciated. Specifically looking for the right equation, gain range, and torque limits for a discrete-step VLA or RL inference setup — not trajectory following.

In general: One approach (which is what you have done, as far as I understand) to use a PD controller running at a higher frequency to go to the 50 Hz targets in joint space.

We highly recommend using direct_torque v2 over v1. It offers much better friction compensation.

  1. Having this low-level controller running in URScript is from our point of view beneficial. This can still be achieved through ur_rtde, I think, by modifying the script code it uses. Note: ur_rtde is not an official library by Universal Robots, hence we cannot give any definitive statements here. We are doing similar things in our ur_client_library c++ library: Add PD controller for joint-space impedance by urfeex · Pull Request #400 · UniversalRobots/Universal_Robots_Client_Library · GitHub
  2. See this line: Add PD controller for joint-space impedance by urfeex · Pull Request #400 · UniversalRobots/Universal_Robots_Client_Library · GitHub we also use option B in our approach
  3. For that kind of PD controller we don’t use the mass matrix.
  4. We haven’t fully tuned gains, yet. An initial shot (tested with V1) can be seen at Add PD controller for joint-space impedance by urfeex · Pull Request #400 · UniversalRobots/Universal_Robots_Client_Library · GitHub. Note: All gains in there are for e-Series robots, so a UR5 in there is a UR5e. I would expect that you will have to tune the gains significantly using V2.
  5. To my knowledge, it doesn’t have its own limits, keep in mind though, that the maximum torque that you can command depends on the torque generated by gravity (see next point)
  6. direct_torque is gravity compensated. This means that sending a target torque value of [0,0,0,0,0,0] means the robot should stand still.

I hope, this helps a bit to clarify things.

Thank you @feex for the guidance. I implemented the recommended approach and ran numerous real VLA episodes with 5 controller configurations. I now have a specific tracking problem and stablization issue.


What I Implemented

  • Equation: tau[j] = Kp[j] * (ref[j] - q[j]) - Kd[j] * qd[j]
  • No mass matrix, no Coriolis (as recommended)
  • Option B damping (-Kd * qd)
  • Gains from PR #400: Kp=[900, 900, 900, 500, 500, 500], Kd=[60, 60, 60, 44.72, 44.72, 44.72]
  • Max torque: [150, 150, 150, 28, 28, 28] Nm
  • Gravity: NOT added (direct_torque compensates internally)

Python writes absolute joint targets to RTDE input registers at 50Hz from the VLA policy.


5 Controller Configurations Tested

# Loop Friction Filter Kd Key Difference
1 Python 500Hz V1 (binary) None 60 Baseline: raw targets, V1
2 Python 500Hz V1 (binary) alpha=0.05 60 Added smoothing to V1
3 URScript 500Hz V2 (per-joint) None 60 V2 friction, raw targets
4 URScript 500Hz V2 (per-joint) alpha=0.05 60 V2 + smoothing
5 URScript 500Hz V2 (per-joint) alpha=0.05 90 V2 + smoothing + 1.5*Kd

V2 friction parameters: viscous_scale=[0.9, 0.9, 0.8, 0.9, 0.9, 0.9], coulomb_scale=[0.8, 0.8, 0.7, 0.8, 0.8, 0.8]

Filter : q_smooth[j] = (1-alpha)q_smooth[j] + alphaq_target[j] at 500Hz. Smooths 50Hz step inputs to avoid torque spikes.( Ref: [Question] Explanation of Sim2Real for universal robot as explained in blog · Issue #3815 · isaac-sim/IsaacLab · GitHub )


VLA Results (Real Robot)

# Picks up USB? Transport Final dist to target Failure mode
1 No N/A 393mm V1 stiction: robot cannot move, Y drifts
2 No N/A 340mm V1 stiction: Z rises instead of descending
3 YES (chunk 155) Drifts to x=529mm 294mm Singularity, wrist3 spins
4 YES (chunk 221) Drifts to x=505mm 358mm Singularity, wrist3 spins
5 YES (chunk 185) Partial 95mm Z stuck at 164mm (target 70mm)

Additional runs with Controller 5 gains but modified:

Kp Kd alpha Result
500 90 0.10 FAIL
900 120 0.05 FAIL: too sluggish, Y drifts +93mm
900 90 0.10 FAIL: Y drifts +75mm

The Problem: Systematic Y+ Drift

With servoJ (gain=300, lookahead=0.1, dt=0.02) at 50Hz, the task succeeds reliably. With direct_torque PD, TCP drifts monotonically in Y+ and little oscillation:

Chunk   1: TCP_Y = 330mm (correct, USB at Y=332mm)
Chunk  40: TCP_Y = 355mm (+25mm)
Chunk  80: TCP_Y = 393mm (+63mm)
Chunk 133: SAFETY STOP at Y=420mm

The VLA is closed-loop (vision-based). Small tracking errors cause it to output wrong corrections that compound.


Working servoJ Call (for reference)

rtde_c.servoJ(target_joints, 0, 0, 0.02, 0.1, 300)

Called at 50Hz. No filter, no smoothing. Works perfectly for entire task.


Full URScript (Controller 5, better result)

def pd_controller():
  Kp = [900.0, 900.0, 900.0, 500.0, 500.0, 500.0]
  Kd = [90.0, 90.0, 90.0, 60.0, 60.0, 60.0]
  alpha = 0.05
  max_torque = [150.0, 150.0, 150.0, 28.0, 28.0, 28.0]
  viscous = [0.9, 0.9, 0.8, 0.9, 0.9, 0.9]
  coulomb = [0.8, 0.8, 0.7, 0.8, 0.8, 0.8]

  q_target = get_actual_joint_positions()
  q_smooth = get_actual_joint_positions()
  running = 1

  while running == 1:
    q_target[0] = read_input_float_register(18)
    q_target[1] = read_input_float_register(19)
    q_target[2] = read_input_float_register(20)
    q_target[3] = read_input_float_register(21)
    q_target[4] = read_input_float_register(22)
    q_target[5] = read_input_integer_register(19) / 1000000.0
    if read_input_integer_register(18) == 1:
      running = 0
    end

    jj = 0
    while jj < 6:
      q_smooth[jj] = (1.0 - alpha) * q_smooth[jj] + alpha * q_target[jj]
      jj = jj + 1
    end

    q = get_actual_joint_positions()
    qd = get_actual_joint_speeds()

    tau = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
    tau[0] = Kp[0]*(q_smooth[0]-q[0]) - Kd[0]*qd[0]
    tau[1] = Kp[1]*(q_smooth[1]-q[1]) - Kd[1]*qd[1]
    tau[2] = Kp[2]*(q_smooth[2]-q[2]) - Kd[2]*qd[2]
    tau[3] = Kp[3]*(q_smooth[3]-q[3]) - Kd[3]*qd[3]
    tau[4] = Kp[4]*(q_smooth[4]-q[4]) - Kd[4]*qd[4]
    tau[5] = Kp[5]*(q_smooth[5]-q[5]) - Kd[5]*qd[5]

    jj = 0
    while jj < 6:
      if tau[jj] > max_torque[jj]:
        tau[jj] = max_torque[jj]
      elif tau[jj] < -max_torque[jj]:
        tau[jj] = -max_torque[jj]
      end
      jj = jj + 1
    end

    direct_torque(tau, viscous_scale=viscous, coulomb_scale=coulomb)
  end
  stopj(2.0)
end
pd_controller()


Happy to test any suggested changes immediately.