INTRODUCTION
Computer vision is a subset of machine learning (ML) that allows automated analysis of large operative video datasets. Laparoscopic cholecystectomy is a high-volume procedure with consistent steps suitable for the application of ML techniques. Recent advances have included automated identification of operative steps and anatomical structures, but the impact of these technologies has been confined to research studies[1,2]. Their use in clinical practice has been limited by a lack of surgeon awareness of the potential applications, concerns regarding the black box nature of algorithms, and a shortage of high-quality surgical video data sets. Given the significant barriers to entry in developing these systems, including computer science expertise and data requirements, it is possible that commercial versions of these tools will become increasingly widespread. In this context, surgeon-led consideration of how these tools add value in clinical practice is needed.
Traditionally, clinicians have used pre-operative variables to predict the degree of gallbladder inflammation and thus surgical difficulty[3]. Increasingly, intraoperative grading scores have been shown to be associated with operative outcomes and technical difficulty[4-7]. Given that outcomes are often related to actions taken intraoperatively, quantification of technical difficulty allows for operative benchmarking, prediction of post-operative outcomes, and development of research standards[8]. We hypothesize that an artificial intelligence platform can confirm the impact of a “difficult” cholecystectomy by assessing a subjective intraoperative cholecystectomy grading system. The aim of this study was to use a commercially available ML-powered surgical video management and analytics platform (Touch SurgeryTM) to evaluate subjective intraoperative grading of operative difficulty during laparoscopic cholecystectomy using a stepwise workflow approach and thereby consider the implications for clinical practice.
METHODS
Study Design
Patients undergoing elective laparoscopic cholecystectomy and routine intraoperative cholangiogram (IOC) by a single specialist hepatobiliary surgeon (North Shore Private Hospital, Sydney, Australia) were consented preoperatively to undergo video recording of their operation. This study was approved by the Ramsay Health Care research ethics committee (approval no. RG2020.153). Video footage from camera insertion to removal of the specimen was captured as part of routine patient care, and an intraoperative photo of the critical view of safety (CVS) was taken in every operation. Measured operative time excluded time spent setting up the equipment, establishing the pneumoperitoneum, and closing the wounds. Laparoscopic cholecystectomy procedures were recorded, saved, de-identified, and then uploaded to Touch SurgeryTM (https://www.touchsurgery.com/professional), a web-based platform for surgical video storage and surgical analytics, powered by ML. Upon upload, all videos were run through the Touch Surgery RedactORTM algorithm to ensure any remaining patient identifiable information was removed. RedactORTM detects portions of the video where the camera is outside of the patient and pixelates the video stream in real time on upload to prevent the recording of any potentially identifiable information. Operations are automatically broken down into phases and steps to provide insights into surgical performance, variation, and standardization, which provides opportunities for pre-operative rehearsal and post-operative review. The underpinning ML uses convolutional neural network architectures to extract a feature representation from each frame and classify it (step one). A single frame, however, is normally not sufficient to correctly identify the operative phase, as it may depict anatomical landmarks that appear throughout the operation.
To overcome this limitation and process the temporal information together with the spatial information, these features are then fed into a recurrent neural network (step two) to improve temporal consistency and representation[9,10]. Touch SurgeryTM phase identification is based on previous works including DeepPhase, EndoNet, and a phase recognition model that achieved an F1 score of 91.1% in predicting the phase of total knee joint replacement[9-11]. The F1 score is a composite measure of ML accuracy, calculated as the harmonic mean of the positive predictive value and sensitivity. The network used to annotate laparoscopic cholecystectomy videos in this paper was developed by Digital Surgery Ltd. (UK) using a large dataset of combined videos from surgeons in different countries and hospitals. It achieves 95% accuracy in detecting phase transitions in laparoscopic cholecystectomy. When tested on the video data included in this study, the model also achieved 95% accuracy. Qualified annotators, trained on surgically validated guidelines, quality-assured the model outputs.
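As an illustration of the F1 score mentioned above, the following minimal sketch computes it from counts of true positives, false positives, and false negatives. The function name and the counts are hypothetical and are not taken from the study or from the Touch Surgery platform.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return F1: the harmonic mean of precision (PPV) and recall (sensitivity)."""
    if true_positives == 0:
        # No correct positive predictions, so F1 is zero by convention.
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for a single phase label (illustrative only).
print(round(f1_score(90, 10, 8), 3))

# With perfect precision but 50% recall, the harmonic mean (about 0.667)
# is lower than the arithmetic mean (0.75), penalizing the imbalance.
print(round(f1_score(5, 0, 5), 3))
```

Because the harmonic mean is pulled toward the lower of the two values, a model cannot reach a high F1 by maximizing only one of precision or sensitivity.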
Operative Phases
In the present platform, Touch SurgeryTM defined the surgical workflow phases for the automated analysis by liaising with key opinion leaders and consulting the literature[12-15]. Based on this, the laparoscopic cholecystectomy videos were divided into the following five operative phases for the purposes of automated analysis:
1 Port insertion and gallbladder exposure.
2 Dissection of Calot’s triangle.
3 Ligation and division of the cystic duct & artery.
4 Gallbladder dissection.
5 Specimen removal and closure.
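One plausible way to turn per-frame phase predictions into the phase times analyzed later is sketched below. It assumes the model emits one phase index per sampled frame at a known sampling rate; the function name, the sampling rate, and the example labels are hypothetical and do not describe the platform's actual implementation.

```python
from collections import Counter

# The five phases listed above, indexed 0-4.
PHASES = (
    "Port insertion and gallbladder exposure",
    "Dissection of Calot's triangle",
    "Ligation and division of the cystic duct and artery",
    "Gallbladder dissection",
    "Specimen removal and closure",
)


def phase_durations(frame_labels, frames_per_second=1.0):
    """Return seconds spent in each phase index, given one label per sampled frame."""
    counts = Counter(frame_labels)
    return {index: counts.get(index, 0) / frames_per_second
            for index in range(len(PHASES))}


# Hypothetical labels sampled at 1 frame per second (illustrative only).
labels = [0] * 3 + [1] * 5 + [2] * 2 + [3] * 4 + [4] * 1
for index, seconds in phase_durations(labels).items():
    print(f"{PHASES[index]}: {seconds:.0f} s")
```

Counting frames per label, rather than timing contiguous runs, keeps the sketch robust to a phase being revisited briefly during the operation.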
CVS
Presence of the CVS was manually documented as part of the Touch SurgeryTM digital analytics service by trained annotators in accordance with the SAGES safe cholecystectomy program[16]. This approach has previously shown validity, with Deal et al.[17] demonstrating a statistically significant correlation between expert and crowd workers’ ratings of CVS achievement.
Grading of Operative Difficulty
The North Shore system uses a 4-point “operative difficulty” grading score which has been recorded prospectively in the operation record for every patient since 1998. This was modified from an earlier grading system first described by Hugh et al. in 1992 in an unselected consecutive series of 100 patients undergoing laparoscopic cholecystectomy[5,18]. Assessment of the intraoperative findings was performed and documented at the commencement of the procedure by the attending surgeon in keeping with the scale as described by O’Neill et al.[Figure 1][5,18].
Inclusion and exclusion criteria
The present cohort includes both elective and acute patients presenting to a single HPB surgeon at the Royal North Shore Hospital and North Shore Private Hospital, St Leonards, NSW, Australia. To be eligible, captured videos had to contain all five phases: port insertion, dissection of Calot’s triangle, ligation and division of the cystic duct and artery, gallbladder dissection, and specimen removal. Videos that did not contain all five phases because recording started late or stopped early were excluded from the analysis.
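The eligibility rule above amounts to a completeness check on the phases detected in each video. A minimal sketch follows, assuming each video is summarized by the set of phase numbers (1-5) found in it; the names are hypothetical.

```python
# Phases 1-5 as numbered in the Operative Phases section.
REQUIRED_PHASES = frozenset(range(1, 6))


def is_eligible(detected_phases) -> bool:
    """A video is eligible only if every required phase was captured."""
    return REQUIRED_PHASES <= set(detected_phases)


print(is_eligible([1, 2, 3, 4, 5]))  # complete recording
print(is_eligible([2, 3, 4, 5]))     # recording started late: phase 1 missing
print(is_eligible([1, 2, 3]))        # recording stopped early
```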
Statistical analysis
Statistical analysis was performed using SciPy and Pingouin[19,20]. D’Agostino-Pearson’s test of normality was performed first. Equality of variance was then assessed with Levene’s test where the data were normally distributed, or with Bartlett’s test where they were not. Mann-Whitney U tests were performed for non-normally distributed samples with equal variance and Brunner-Munzel tests for those with unequal variance. For normally distributed samples with equal variance, a t-test was performed, or Welch’s t-test for those with unequal variance.
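The test-selection logic described above can be summarized as a small decision function. This is a sketch only: the paper names the libraries but not the exact calls, so the mapping to SciPy function names below is an assumption, and the functions return the name of the test rather than running it.

```python
def variance_test(is_normal: bool) -> str:
    """Variance check paired with each normality outcome, as described in the text."""
    return "scipy.stats.levene" if is_normal else "scipy.stats.bartlett"


def comparison_test(is_normal: bool, equal_variance: bool) -> str:
    """Two-sample test chosen from the normality and variance results."""
    if is_normal:
        if equal_variance:
            return "scipy.stats.ttest_ind(equal_var=True)"   # Student's t-test
        return "scipy.stats.ttest_ind(equal_var=False)"      # Welch's t-test
    if equal_variance:
        return "scipy.stats.mannwhitneyu"
    return "scipy.stats.brunnermunzel"


# Enumerate all four branches of the decision tree.
for normal in (True, False):
    for equal in (True, False):
        print(normal, equal, variance_test(normal), comparison_test(normal, equal))
```

In practice the two boolean inputs would come from the P-values of the normality test (for example `scipy.stats.normaltest`, which implements the D'Agostino-Pearson test) and the chosen variance test, compared against a significance threshold.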
RESULTS
During the study period, 233 patients consented to the video recording of their procedures, and from this group, 206 (88%) videos met the inclusion criteria. Twenty-seven laparoscopic cholecystectomies were excluded because incomplete video recording meant they were not amenable to analysis. The videos analyzed included a consecutive series of patients operated on by a single surgeon over a 3-year period. Most operations were done electively, and in all cases, a standardized operative approach including routine intraoperative cholangiography was undertaken. Demographic and peri-operative details of the cohort are shown in Table 1.
The median operative time was 19 min 53 s (IQR 15 min 53 s to 26 min 16 s). In total, 143 (69%) patients were classified as either grade 1 or 2, with a median operative time of 17 min 53 s (IQR 15 min 24 s to 21 min 38 s). In comparison, 63 (31%) patients were classified as either grade 3 or 4, with a median operative time of 25 min 49 s (IQR 20 min 12 s to 38 min 38 s). Operative time was significantly shorter for patients graded 1 or 2 than for those graded 3 or 4 (P < 0.01) [Figure 2]. The variation in operative length was greatest in patients who were assigned a grade of 3 or 4. The time differences and P-values between phases are documented in Table 2.
There were 33 (16%) grade 1 patients, with a median operative time of 15 min 49 s (IQR 13 min 14 s to 18 min 15 s), and 110 (54%) grade 2 patients, with a median operative time of 18 min 25 s (IQR 15 min 45 s to 21 min 51 s). There were 52 (25%) grade 3 patients, with a median operative time of 23 min 48 s (IQR 19 min 56 s to 33 min 34 s), and 11 (5%) grade 4 patients, with a median operative time of 56 min 4 s (IQR 41 min 18 s to 71 min 11 s).
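Summary statistics of this kind can be derived from operative times stored in seconds. The sketch below uses only the Python standard library; the helper names and the sample times are hypothetical, and the quartile method is an assumption, since the paper does not state which one was used.

```python
from statistics import median, quantiles


def format_min_sec(seconds: float) -> str:
    """Format a duration in seconds as minutes and seconds."""
    minutes, secs = divmod(round(seconds), 60)
    return f"{minutes} min {secs} s"


def median_and_iqr(times_s):
    """Return (median, first quartile, third quartile) of times in seconds."""
    q1, _, q3 = quantiles(times_s, n=4, method="inclusive")
    return median(times_s), q1, q3


# Hypothetical operative times in seconds (illustrative only, not study data).
times = [600, 900, 1200, 1500, 1800]
mid, q1, q3 = median_and_iqr(times)
print(f"{format_min_sec(mid)} (IQR {format_min_sec(q1)} to {format_min_sec(q3)})")
```

Different quartile conventions give slightly different IQR bounds on small samples, so the method should be reported alongside the results.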
When the operations were analyzed according to the five predetermined operative phases, all phases took significantly longer to complete in grade 3 and 4 patients compared with grade 1 and 2 patients [Table 2] [Figure 3].
The rate of achievement of the CVS for each operative grade is shown in Table 3. The rate of CVS achievement did not differ significantly between grades 1-2 and grades 3-4 (P = 0.177).
DISCUSSION
The ML-powered system allowed automated analysis of a large video dataset, confirming that the total operative time and individual operative phases were correlated with an intraoperative difficulty rating. Operative time is a consistent marker of technical ability and operative difficulty across the surgical literature, and grading of laparoscopic cholecystectomy difficulty has been shown to have validity in predicting outcome[4-6,8,21-24]. This study provides an example of the emerging clinical utility of computer vision technology in providing automated operative analytics in clinical practice.
Accurate identification of the operative phase is important in allowing workflow planning and the development of intraoperative decision support systems. However, to have utility, operative phases need to be clinically relevant. While previous publications have considered the accuracy of automated phase identification, there is currently no universal standard in laparoscopic cholecystectomy[25]. The present study investigated the clinical utility of automated phase identification by considering the impact of a subjective grading score on operative phase times. A significant difference in phase times was seen across all phases when comparing grade 1 and 2 gallbladders with grade 3 and 4 gallbladders. The major time difference between grades was seen in the time taken for initial exposure and the time to dissect Calot’s triangle, which is arguably the most critical step in avoiding a bile duct injury. The image findings of the IOC were not captured as part of the laparoscopic recording, which meant this could not be included as a discrete phase in this study; however, routine performance ensured there was no biasing effect between groups. While further work is needed to create a unified standard of phase identification, the data presented here suggest clinical utility of the chosen phases.
Achievement of the CVS is an established requirement in safe cholecystectomy[16,26]. Rates of CVS achievement are often overstated, with one study finding that the CVS was achieved in only 10.8% of patients despite a documented achievement rate of 80%[27]. Intraoperative photo documentation of the CVS has been suggested as a quality control measure; however, this is surgeon dependent and necessitates subsequent external audit to ensure consistency[28]. In contrast, routine intraoperative video recording removes barriers to capture and may ensure consistency of achievement[29]. The high rate of CVS achievement in the current study (88%) is in keeping with operations being performed in the elective setting by a sub-specialist hepatobiliary surgeon. The inverse relationship between operative grade and CVS achievement is concordant with an accurate grading score. Broader validation could allow for a benchmark rate of CVS achievement, prompting audit and review if rates persistently drop below this. In the future, prospective analysis could provide intraoperative prompts, with manual override, to ensure the CVS is achieved.
Surgical curricula are increasingly relying on competency-based models as a means of capturing progress[30-32]. This approach reflects the operative learning curve, in which trainees perform different segments of each operation under supervision before progressing to perform the entirety of the operation. By creating agreed phases or steps of each operation as part of a training curriculum, these competencies can be captured, and accurate feedback provided. Capture and automated assessment of these phases with ML techniques is a logical step in this pathway. While manual review of large volumes of video is not feasible, employing AI allows automated analysis and segmentation of phases. This study provides timeframes for each stage of the operation that represent a technical gold standard, as the operations were performed by an experienced laparoscopic hepatobiliary surgeon. Although further data are needed for each level of trainee and each grade of gallbladder difficulty, this forms the first part of establishing competency-based standards for a surgical procedure. In the future, failure to meet expected time requirements might trigger a manual review of the technique with surgeon mentors. Prospective capture with automated grading and analysis could allow for focused video review between surgeon and trainee. Routine operative difficulty grading would quantify the technical difficulty of the procedures trainees are undertaking. Given that operative technical skill and operative difficulty grade are both predictive of patient outcomes, both need to be taken into account when considering trainee progress[4,5,8]. Understanding the degree of difficulty of the operations the trainee is undertaking, and which phases of these are challenging, would more accurately quantify the trainee’s progression through their learning curve.
Given the documented utility of the classification system for quantifying the difficulty of laparoscopic cholecystectomy in both classical and ML evaluations, validation of clinical usefulness needs to be confirmed in a large cohort of surgeons at different operative levels. This would allow for the generation of normal curves for expected operating time for each phase of the identified operation. The novel test set from this study could potentially be used to develop automated identification of the intraoperative difficulty grade.
The present study focused on overall and phase timing as measures of operative difficulty as a means of considering the clinical utility of the computer vision platform. Time is only one aspect of operative performance that can be assessed using ML techniques. In particular, automated assessment of CVS attainment would represent a significant advancement. Other factors that could be captured automatically include the rate of gallstone spillage, the number of instrument changes, and the economy of instrument movement. Incorporating these and other factors in automated analysis could produce a more comprehensive assessment of operative techniques for both audit and training purposes.
AI models are able to segment and automatically identify critical operative steps[1]. However, in most cases, this has involved retrospective capture and analysis of video in relatively small sample sizes, and this approach is limited by the time cost of surgeon video labeling[17]. Through pooled data sets, increased surgeon interest, and possibly unsupervised ML, these issues are slowly being addressed. It is even possible to envisage that operative video will soon be stored as part of the patient notes, with automated operative note generation. As these difficulties are overcome, and AI tools become readily available in the workplace, clinician involvement in decision-making regarding utility, utilization, and value will be needed. Engagement ensures the tools developed will be driven by clinical applicability and provide value in patient care rather than becoming an externally imposed quality indicator that adds to the already burgeoning paperwork load.
Computer vision tools lack easy explainability because the internal logic of their underpinning neural network algorithms is opaque, limiting clinicians’ ability to understand and explain how these tools reach their conclusions. This concern has been particularly pronounced when these tools are used to guide treatment decisions, where the inability to explain fully how a decision is reached precludes a clinician’s ability to obtain informed consent from their patients[33]. However, the recent US Food and Drug Administration approval of the GI Genius system for automated polyp identification, following clinical trial data showing increased adenoma detection rate, signifies the increasing acceptability of these systems where they are clinically explainable and improve outcomes[34,35]. The current retrospective nature of surgical video analysis platforms means that they do not directly impact decision-making around patient treatment and therefore do not violate the principles of informed consent due to a lack of algorithmic explainability. While this lessens the ethical barrier to uptake, it is still imperative for clinicians to consider how these platforms should be used in clinical practice and whether their outputs are consistent with clinical intuition. Clinician input is therefore needed to link these systems to clinical practice and consider whether their results have clinical explainability. In particular, while phase identification algorithms in laparoscopic cholecystectomy have shown reasonable accuracy, their consistency with real-life clinical intuition needs to be considered. In this context, the association seen between increasing operative time and increasing operative difficulty, particularly in the dissection of Calot’s triangle, is consistent with clinical intuition and clinically explainable.
The study presents a single specialist surgeon cohort of prospectively captured laparoscopic cholecystectomy operations. While the universality of laparoscopic cholecystectomy means that from a technical perspective, this study is generalizable, this may not be true for the ML analysis. This is because these systems can be brittle with significant changes in analysis quality due to seemingly irrelevant changes in operative approaches or equipment[36]. It should also be noted that the operative times cannot be extrapolated due to the procedures being undertaken by a single expert HPB surgeon. Further validation of intraoperative grading is needed in external data sets encompassing a broader number of centers. ML in surgery is a nascent field, but this study and others like it demonstrate the potential in operative analytics, documentation, audit and training of future surgeons.
DECLARATIONS
Acknowledgements
With thanks to Touch SurgeryTM for their access to the platform, advice and provision of output data throughout this study.
Authors’ Contributions
Made substantial contributions to the conception, acquisition, and analysis of the data, and to the drafting and revision of this work: Tranter-Entwistle I, Eglinton T, Connor S, Hugh TJ.
Availability of Data and Research Materials
Data could be provided on reasonable request.
Financial support and sponsorship
None.
Conflicts of interest
Isaac Tranter-Entwistle has received funding from Medtronic (Touch SurgeryTM is a subsidiary of Medtronic) to undertake a PhD through the University of Otago from February 2021. This study was conducted in 2020.
Thomas Hugh has undertaken consultancy for Touch SurgeryTM separate to the present study and was not involved in the data collection or analysis of results in this study.
Saxon Connor has undertaken pro bono consultancy for Medtronic/Touch SurgeryTM developing a freely available educational application around laparoscopic cholecystectomy, as well as video annotation as part of an unrelated study.
Tim Eglinton has no conflicts of interest to declare.
Ethical approval and consent to participate
This study was approved by the Ramsay Health Care research ethics committee (approval no. RG2020.153). Patients undergoing elective laparoscopic cholecystectomy and routine intraoperative cholangiogram (IOC) by a single specialist hepatobiliary surgeon (North Shore Private Hospital, Sydney, Australia) were consented preoperatively to undergo video recording of their operation. All study participants provided informed consent.
Consent for publication
All videos were run through the Touch Surgery RedactORTM algorithm to ensure any remaining patient identifiable information was removed.
Copyright
© The Author(s) 2022.