We propose a probabilistic shared-control solution for navigation, called Robot Trajectron V2 (RT-V2), that enables accurate intent prediction and safe, effective assistance in human-robot interaction. RT-V2 jointly models a user's long-term behavioral patterns and their noisy, low-dimensional control signals by combining a prior intent model with a posterior update that accounts for real-time user input and environmental context. The prior captures the multimodal and history-dependent nature of user intent using recurrent neural networks and conditional variational autoencoders, while the posterior integrates this with uncertain user commands to infer desired actions. We conduct extensive experiments to validate RT-V2 across synthetic benchmarks, human-computer interaction studies with keyboard input, and brain-machine interface experiments with non-human primates. Results show that RT-V2 outperforms the state of the art in intent estimation, provides safe and efficient navigation support, and adequately balances user autonomy with assistive intervention. By unifying probabilistic modeling, reinforcement learning, and safe optimization, RT-V2 offers a principled and generalizable approach to shared control for diverse assistive technologies.
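The core idea of fusing a multimodal prior over intent with a likelihood for the noisy user command is a standard Bayesian update. The sketch below is a minimal illustration of that fusion over a discrete set of candidate goals, not the RT-V2 model itself: the function name `posterior_intent`, the von Mises-style directional likelihood, and the concentration parameter `kappa` are all illustrative assumptions.

```python
import numpy as np

def posterior_intent(prior_w, goal_dirs, user_cmd, kappa=4.0):
    """Illustrative Bayes update: fuse a prior over candidate goals with a
    noisy 2-D directional user command (hypothetical likelihood model)."""
    # Cosine similarity between the command and each goal direction.
    cos = goal_dirs @ user_cmd / (
        np.linalg.norm(goal_dirs, axis=1) * np.linalg.norm(user_cmd) + 1e-9
    )
    lik = np.exp(kappa * cos)        # von Mises-style directional likelihood
    post = prior_w * lik             # posterior ∝ likelihood × prior
    return post / post.sum()
```

With a uniform prior over two goals at directions (1, 0) and (0, 1), a command pointing along (1, 0) concentrates the posterior on the first goal; larger `kappa` models a more reliable input channel.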
Simulation experiment setup.
Performance on objective and subjective metrics. (a) Total keyboard input. (b) Completion time. (c) Trajectory length. (d) Agreement survey. (e) NASA-TLX survey.
Visualization of successful navigation demonstrations in simulation. Trajectories for the three methods (Direct, HO+APF, and RT-V2) are plotted across 20 scenes. Green boxes, white dashed boxes, and pink boxes denote true goals, distractors, and obstacles, respectively. Light blue, yellow, and orange lines denote the trajectories of Direct, HO+APF, and RT-V2, respectively.
Visualization of failure trials of RT-V2. Both trajectories and entropy are plotted. The upper row of each subfigure shows the goals (green boxes), obstacles (pink boxes), distractors (white dashed boxes), trajectories (orange lines), and user commands (light blue arrows); for visibility, user commands are drawn every 4 iterations. The lower row of each subfigure shows the upper bound (Entropy UB) and lower bound (Entropy LB) of the entropy of the action-GMMs generated by RT-V2 at each iteration. In addition, trajectory opacity encodes the normalized entropy lower bound.
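The differential entropy of a Gaussian mixture has no closed form, which is why upper and lower bounds are reported for the action-GMMs. A common pair of bounds follows from conditioning on the component index Z: H(X|Z) ≤ H(X) ≤ H(Z) + H(X|Z). The sketch below computes these generic bounds; the paper's exact bounds may differ, and the function names are illustrative.

```python
import numpy as np

def gaussian_entropy(cov):
    """Differential entropy of a multivariate Gaussian with covariance cov:
    0.5 * log((2*pi*e)^d * det(cov))."""
    d = cov.shape[0]
    _, logdet = np.linalg.slogdet(cov)
    return 0.5 * (d * np.log(2 * np.pi * np.e) + logdet)

def gmm_entropy_bounds(weights, covs):
    """Generic entropy bounds for a Gaussian mixture with component index Z:
    H(X|Z) <= H(X) <= H(Z) + H(X|Z)."""
    w = np.asarray(weights, dtype=float)
    comp = np.array([gaussian_entropy(c) for c in covs])
    lb = float(w @ comp)                      # H(X|Z)
    ub = lb - float(np.sum(w * np.log(w)))    # add H(Z) = -sum w log w
    return lb, ub
```

For a single-component mixture the two bounds coincide with the Gaussian entropy; with equal weights over k components the gap is exactly log k, so a shrinking gap indicates the mixture collapsing toward one mode.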
Results of the BMI fixed-obstacle shared autonomy experiments. (a) Success rate. (b) Trajectory length. (c) Total iterations. (d) Completion time. (e) Scaled trajectory length. (f) Scaled total iterations. (g) Scaled completion time. *=p<0.05, **=p<0.01, ***=p<0.001, and ****=p<0.0001.