Attention YOLACT++: achieving robust and real-time medical instrument segmentation in endoscopic procedures.
Ángeles Cerón, Juan Carlos
Image-based tracking of laparoscopic instruments via instance segmentation plays a fundamental role in computer- and robot-assisted surgery by aiding surgical navigation and increasing patient safety. Despite its crucial role in minimally invasive surgery, accurate tracking of surgical instruments is challenging for two main reasons: (1) the complexity of the surgical environment, and (2) the lack of model designs that offer both high accuracy and high speed. Previous attempts in the field have prioritized robust performance over real-time speed, rendering them infeasible for live clinical applications. In this thesis, we propose the use of attention mechanisms to significantly improve the recognition capabilities of YOLACT++, a lightweight single-stage instance segmentation architecture, which we target at medical instrument segmentation. To further improve the performance of the model, we also investigate the use of custom data augmentation and anchor optimization via a differential evolution search algorithm. Furthermore, we investigate the effect of multi-scale feature aggregation strategies in the architecture. We perform ablation studies with Convolutional Block Attention Module (CBAM) and Criss-cross Attention modules at different stages in the network to determine an optimal configuration. Our proposed model, CBAM-Full + Aug + Anch, drastically outperforms the previous state of the art in robustness metrics commonly used in medical segmentation, achieving 0.435 MI_DSC and 0.471 MI_NSD while running at 69 fps, which is more than 12 points higher in both metrics and 14 times faster than the previous best model. To our knowledge, this is the first work that explicitly focuses on both real-time performance and improved robustness.
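For readers unfamiliar with the reported metrics, the following is a minimal sketch of the plain Dice similarity coefficient (DSC) on a pair of binary masks, written in Python. It is not the thesis's evaluation code: MI_DSC is a multi-instance variant whose aggregation over instruments and images is not described in this abstract, and the function name `dice_coefficient`, the nested-list mask format, and the empty-mask convention are all assumptions made for illustration.

```python
def dice_coefficient(pred, target):
    """Dice similarity coefficient between two binary masks.

    Masks are given as nested lists of 0/1 values (an assumed format
    chosen to keep the sketch dependency-free).
    """
    pred_flat = [bool(v) for row in pred for v in row]
    target_flat = [bool(v) for row in target for v in row]
    if len(pred_flat) != len(target_flat):
        raise ValueError("masks must have the same number of pixels")

    # Pixels labelled as instrument in both masks.
    intersection = sum(p and t for p, t in zip(pred_flat, target_flat))
    total = sum(pred_flat) + sum(target_flat)

    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Toy 2x3 masks: each has 3 foreground pixels, 2 of which overlap.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
print(dice_coefficient(pred, target))
```

A score of 1 indicates perfect overlap and 0 indicates none, so the scores quoted above can be read on that same 0 to 1 scale, subject to whatever multi-instance aggregation the full thesis defines.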