Diffusion Stabilizer Policy for Automated Surgical Robot Manipulations

UM-SJTU Joint Institute, Shanghai Jiao Tong University; The Chinese University of Hong Kong; Shanghai Jiao Tong University
*Equal contribution
Figure: Overview of Diffusion Stabilizer Policy, showing perturbed demonstrations during training and stable trajectories during inference.

Diffusion Stabilizer Policy first learns on clean demonstrations, then filters mixed clean and perturbed data with action prediction error so the final policy produces more stable surgical manipulation trajectories.

Abstract

Intelligent surgical robots have the potential to revolutionize clinical practice by enabling more precise and automated surgical procedures. However, automation for surgical tasks remains under-explored compared with recent progress on household manipulation. To extend modern policy learning methods to surgical robotics, this work proposes Diffusion Stabilizer Policy (DSP), a diffusion-based policy learning framework that enables training with imperfect, perturbed, or even failed trajectories. DSP first trains a diffusion stabilizer policy using only clean data, then continuously updates the policy with a mixture of clean and perturbed data filtered by action prediction error. Experiments in both simulation and real-world settings demonstrate superior performance under different perturbation types.
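
The two-stage idea in the abstract (fit on clean data, then admit mixed data only when the action prediction error is low) can be illustrated with a minimal sketch. This is not the authors' implementation: a least-squares linear policy stands in for the diffusion policy, and the helper names, the constant-offset perturbation, the mean-squared-error metric, and the threshold value of 0.1 are all assumptions made for illustration.

```python
import numpy as np


def fit_linear_policy(observations, actions):
    """Fit a stand-in policy (least squares), NOT a diffusion model."""
    weights, *_ = np.linalg.lstsq(observations, actions, rcond=None)
    return lambda obs: obs @ weights


def action_prediction_error(predict, observations, actions):
    """Per-sample mean squared error between predicted and recorded actions."""
    return np.mean((predict(observations) - actions) ** 2, axis=1)


def filter_by_error(predict, observations, actions, threshold):
    """Keep samples whose action prediction error is at or below the threshold."""
    keep = action_prediction_error(predict, observations, actions) <= threshold
    return observations[keep], actions[keep], keep


# Toy data (hypothetical): 3-D observations mapped linearly to 2-D actions.
rng = np.random.default_rng(0)
true_weights = np.array([[1.0, 0.0], [0.0, -1.0], [0.5, 0.5]])
clean_obs = rng.normal(size=(200, 3))
clean_act = clean_obs @ true_weights

# Stage 1: train the stand-in policy on clean demonstrations only.
policy = fit_linear_policy(clean_obs, clean_act)

# Mixed data: the first 50 samples are clean, the last 50 are perturbed.
mixed_obs = rng.normal(size=(100, 3))
mixed_act = mixed_obs @ true_weights
mixed_act[50:] += 1.0  # assumed perturbation: constant offset on the actions

# Stage 2: filter the mixed data by prediction error, then update the policy.
kept_obs, kept_act, keep = filter_by_error(policy, mixed_obs, mixed_act, threshold=0.1)
policy = fit_linear_policy(
    np.concatenate([clean_obs, kept_obs]),
    np.concatenate([clean_act, kept_act]),
)
print(f"kept {int(keep.sum())} of {len(keep)} mixed samples")
```

In this toy setting the perturbation is large relative to the threshold, so the filter separates the two groups cleanly; with real demonstrations the threshold would need tuning, and the error would be computed from the diffusion policy's predicted actions rather than a linear fit.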

BibTeX

@misc{ho2026diffusionstabilizerpolicyautomated,
      title={Diffusion Stabilizer Policy for Automated Surgical Robot Manipulations}, 
      author={Chonlam Ho and Jianshu Hu and Lei Song and Hesheng Wang and Qi Dou and Yutong Ban},
      year={2026},
      eprint={2503.01252},
      archivePrefix={arXiv},
      primaryClass={cs.RO},
      url={https://arxiv.org/abs/2503.01252}, 
}