Artificial Intelligence Studies
We are adding an artificial intelligence (AI) capability to our platform so that learners can get immediate feedback on their performance. This AI technology is novel in that the processing servers are remote, meaning that the learner does not need a high-bandwidth internet connection.
Pilot Study, Ectopic Pregnancy
- Surgical novices and experts from Ethiopia, Cameroon, Kenya, and the United States contributed 36 laparoscopic salpingostomy simulation videos.
- 23 videos were graded by participants using a modified OSATS (Objective Structured Assessment of Technical Skills).
- These 23 videos were augmented by adding noise to the extracted data, which increased the sample size to 112.
- We decided to focus on the modified OSATS portion of the VOP.
- Each video frame produced an output binary map (0=background, 1=tool). Tool positions were used to create vectors.
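The mask-to-position and noise-augmentation steps described above could be sketched roughly as follows. This is a minimal illustration, not the pipeline actually used: it assumes the tool position is taken as the centroid of the pixels labelled 1, that the added noise is Gaussian, and that each mask contains a single instrument. The function names `tool_centroid` and `augment_with_noise`, the value of `sigma`, and the tiny example frames are all hypothetical.

```python
import random


def tool_centroid(mask):
    """Return the (x, y) centroid of pixels equal to 1, or None if there are none.

    Assumption: the tool position is the centroid of the tool pixels.
    """
    sum_x, sum_y, count = 0, 0, 0
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value == 1:
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None
    return (sum_x / count, sum_y / count)


def augment_with_noise(track, sigma, rng):
    """Return a noisy copy of a position track (assumption: Gaussian noise)."""
    return [(x + rng.gauss(0, sigma), y + rng.gauss(0, sigma)) for x, y in track]


# Two hypothetical 3x4 binary maps (0 = background, 1 = tool).
frames = [
    [[0, 0, 0, 0],
     [0, 1, 1, 0],
     [0, 1, 1, 0]],
    [[0, 0, 0, 0],
     [0, 0, 1, 1],
     [0, 0, 1, 1]],
]

# One position per frame, then displacement vectors between consecutive frames.
track = [tool_centroid(frame) for frame in frames]
vectors = [(b[0] - a[0], b[1] - a[1]) for a, b in zip(track, track[1:])]

# One synthetic variant of the extracted track; sigma is an arbitrary choice.
augmented = augment_with_noise(track, sigma=0.5, rng=random.Random(0))

print(track)    # [(1.5, 1.5), (2.5, 1.5)]
print(vectors)  # [(1.0, 0.0)]
```

Repeating `augment_with_noise` with different random seeds would yield several synthetic tracks per original video, which is one plausible way a set of 23 videos could grow to a larger sample.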
Grading of OSATS Characteristics via AI
- We measured three different mathematical variables/correlates associated with time and movement of the tools.
|Mathematical correlate (time-space variable)||Expert surgeon|
|Path length over total time||Decreased|
|Standard deviation of tool position||Decreased|
- Path length is the total distance covered by the right and left instruments over time. The shorter the path length (that is, the fewer the movements), the more skilled the operator.
- For example, the first photo on the left below shows the left-hand measurement of an expert, with the x coordinate in blue and the y coordinate in yellow; the movements are fluid and smooth overall. Compare this with a beginning-level resident (second pane), whose trace shows excess awkward movements and jerkiness, and then with the medical student.
- The second variable measured was the standard deviation of tool position. The lower the variance in the tool’s position, the more skilled the practitioner.
- The third variable is the log of dimensionless jerk, a smoothness metric computed from the tool's speed profile. Experts produce less jerk.
- These three mathematical dimensions were averaged for each video and combined into a 6-dimensional vector (right and left hand for each of the 3 different dimensions).
- Using AI, we trained a classifier to determine which combination of variables best predicted human scoring for each OSATS question.
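The three motion variables and the 6-dimensional feature vector described above could be computed along these lines. This is a sketch under stated assumptions, not the project's actual code: positions are assumed to be sampled at a fixed interval `dt`, the standard deviation of a 2D position is assumed to combine the x and y deviations, and the jerk measure uses one common formulation of log dimensionless jerk without the leading minus sign that some definitions apply, so that a lower value means smoother movement. All function names and the example tracks are hypothetical.

```python
import math
import statistics


def path_length(track):
    """Total distance travelled along a list of (x, y) positions."""
    return sum(math.dist(a, b) for a, b in zip(track, track[1:]))


def position_sd(track):
    """Spread of tool position (assumption: x and y deviations combined)."""
    sd_x = statistics.pstdev([p[0] for p in track])
    sd_y = statistics.pstdev([p[1] for p in track])
    return math.hypot(sd_x, sd_y)


def log_dimensionless_jerk(track, dt):
    """ln(T^3 / v_peak^2 * integral of (d2v/dt2)^2 dt) for the speed profile.

    Assumption: sign convention chosen so that lower means smoother.
    Needs at least 4 positions and some movement.
    """
    speed = [math.dist(a, b) / dt for a, b in zip(track, track[1:])]
    # Second finite difference of the speed profile.
    second = [(speed[i + 1] - 2 * speed[i] + speed[i - 1]) / dt ** 2
              for i in range(1, len(speed) - 1)]
    integral = sum(s * s for s in second) * dt
    if integral == 0:
        return float("-inf")  # perfectly constant speed
    duration = dt * (len(track) - 1)
    return math.log(duration ** 3 / max(speed) ** 2 * integral)


def hand_features(track, dt):
    """Three variables for one instrument."""
    duration = dt * (len(track) - 1)
    return [path_length(track) / duration,
            position_sd(track),
            log_dimensionless_jerk(track, dt)]


def video_features(left, right, dt):
    """6-dimensional vector: three variables for each hand."""
    return hand_features(left, dt) + hand_features(right, dt)


# Hypothetical tracks with the same path length but different smoothness.
smooth = [(0, 0), (1, 0), (3, 0), (6, 0), (8, 0), (9, 0)]
jerky = [(0, 0), (3, 0), (3, 0), (6, 0), (6, 0), (9, 0)]

features = video_features(smooth, jerky, dt=1.0)
print(len(features))  # 6
print(log_dimensionless_jerk(smooth, 1.0) < log_dimensionless_jerk(jerky, 1.0))  # True
```

Both example tracks cover the same distance, so path length alone cannot separate them; the jerk term is what distinguishes the stop-and-start track from the smoothly accelerating one. A vector like `features` is the kind of input a classifier could be trained on against the human OSATS scores.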
Comparison of Human vs. AI
- We performed a 50/50 random split of our synthetic data: half was used to train the classifiers (see above), and the other half served as our validation cohort, the group used to test our AI.
- Figure 1 shows a sample of 69 videos from our synthetic sample, comparing AI scores on a 5-point Likert scale (in orange) with the human evaluations (in blue). There is concordance between human and AI scores.
- Figure 2 shows the overall performance as well as the three domains individually. About 70% of videos had a complete match in Likert scale scores between human and AI, approaching 80% for the flow of operation criterion of OSATS. The other 30% differed by 1 or more points on the 5-point Likert scale.
- Highest mean accuracy per question was flow of operations (78%) followed by economy of time and motion (73%), instrument handling (72.4%), and overall performance (70.3%).
- Video review of global characteristics using AI was similar to that of human review in our laparoscopic training system.
- In LMICs where expert laparoscopic surgeons are often non-existent and thus direct teaching is limited, machine learning may help fill this educational gap.
- As more video data is collected, we will further optimize the accuracy and reliability of the AI algorithm.
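The 50/50 random split and the human-versus-AI agreement rates reported in this section can be illustrated with a short sketch. It assumes that "complete match" means identical Likert scores and that the remaining videos are those differing by 1 or more points. The function names, the random seed, and the example scores are hypothetical and are not study data.

```python
import random


def split_half(items, seed=0):
    """Randomly split items into two halves (training, validation)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    middle = len(shuffled) // 2
    return shuffled[:middle], shuffled[middle:]


def exact_match_rate(human, ai):
    """Fraction of videos where the AI score equals the human score."""
    return sum(h == a for h, a in zip(human, ai)) / len(human)


def within_one_rate(human, ai):
    """Fraction of videos where the scores differ by at most 1 point."""
    return sum(abs(h - a) <= 1 for h, a in zip(human, ai)) / len(human)


# 112 synthetic samples, identified here simply by index.
train_ids, validation_ids = split_half(range(112))
print(len(train_ids), len(validation_ids))  # 56 56

# Made-up 5-point Likert scores for five validation videos.
human_scores = [3, 4, 2, 5, 1]
ai_scores = [3, 4, 3, 5, 3]
print(exact_match_rate(human_scores, ai_scores))  # 0.6
print(within_one_rate(human_scores, ai_scores))   # 0.8
```

Reporting a within-one rate alongside the exact-match rate would show how many of the non-matching videos missed by a single Likert point rather than by a larger margin.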