I had the great pleasure of going to the 56th annual meeting of the Society of Thoracic Surgeons (STS) in New Orleans, Louisiana to present my research “Cardiothoracic Surgeons Who Stay on as Faculty at Their Training Institution Demonstrate Greater Academic Productivity Over Their Careers” as an ePoster. With a team at the Stanford Cardiothoracic Surgery Department, we made a database of all academic cardiothoracic surgeons in the US and found that institutional continuity from one’s training to the first job is correlated with a greater number of publications, those publications having a higher impact, and a greater likelihood of achieving NIH funding. The implication of our study was that institutions should further enhance mentorship networks and academic start-up resources to support these first-time faculty arriving from different institutions. In this first part of a two-part series, however, I want to discuss some of the sessions I attended at STS 2020.

 

I attended a session titled “Machine Learning in Prediction of Cardiothoracic Surgery Outcomes” where they discussed the basis of machine learning in CT surgery before diving into a couple of specific examples. Machine learning is defined as the use of algorithms or statistical models to perform a specific task without instructions or, in other words, the use of patterns to make inferences. A machine-learning algorithm could, for example, predict a patient’s future health trajectory. Doctors do that already on the basis of their experience and training, but algorithms can systematize this process and gain greater insights by analyzing a much larger population than doctors will ever see in their lifetime. Furthermore, with the enormous computational power at their disposal, machine learning software runs on milliseconds, making them much faster than a doctor can ever be.

Typically, doctors want a binary outcome (yes or no): will the patient die after cardiac surgery, will the patient be rehospitalized within the next month, will the patient have an adverse effect due to the treatment regimen? Algorithms, however, do not give a binary outcome, instead of providing a probability of an event occurring. One could translate the probability to a binary classification where, if P(Y = 1) ≥ t, then the algorithm predicts the patient will die; if P(Y = 1) < t, then the algorithm predicts the patient will live. You could set t to 50% for instance. Nonetheless, you could also make t very large so that it rarely predicts mortality and is really good for patients who have more pessimistic prognoses. Alternatively, you could make t very small so that it predicts mortality more often and detects all patients who might be at risk.

Logistic regression, a standard statistical model used to determine the relationship between numerous variables, is actually a type of machine learning. The problem with a logistic regression model, however, is that it is unable to capture non-linear relationships and unrealistically assumes each independent variable affects the risk in isolation with the rest. A tree-based machine-learning algorithm, on the other hand, makes decision rules that account for independent variables affecting one another. That being said, tree-based algorithms can take quite a while to train, need large amounts of data, and require a diverse data set for the algorithm to be of quality. Another type of machine learning, ensemble algorithms make a large group of the aforementioned trees with each tree “voting” on the predicted outcome of every observation to generate the official prediction. The tradeoff here is that, while you get greater accuracy, you lose interpretability as it becomes harder and harder to determine why the algorithm makes the specific decision that it does. The last major type of machine learning algorithms are artificial neural networks, which take in information as an input, put it through a function, and then give out an output, similar to biological neural networks. The problem here is that not only do these artificial neural networks require a great deal of computational power, but they also are black boxes that make it impossible to know the rationale behind the algorithm’s decision. The black box problem has great ethical implications: who should be held liable if the algorithm gives the wrong decision? Furthermore, there is no way to know if the algorithm is going to make the wrong decision again because nobody knows why it made the wrong decision in the first place. In the same way that we do not fully understand how our brain works, we do not fully understand how artificial neural networks work.

This all being said, the future of machine learning is probably not to replace expert judgment but to improve risk prediction and make better decisions. Furthermore, the effectiveness of machine learning over traditional logistic regression risk models varies from problem to problem, and there have not been many studies showing if machine learning was even able to change behavior or outcomes. Even the benefits shown by researchers presenting their use of machine learning for surgical aortic valve replacement (SAVR) and coronary artery bypass graft (CABG) was only modest at best. Machine learning has great potential, but its future may be slightly overhyped.

Pros and Cons of Various Machine Learning Algorithms

 

I also attended a session called “Cardiothoracic Surgery Education and Professional Development” where my mentor was presenting an abstract that found that surgeons who do their general surgery residency at an institution with a CT surgery fellowship program exhibit greater academic productivity and show faster career advancement. This finding may be evidence of the importance of early CT exposure in creating successful academic CT surgeons. A Stanford medical student who was also on our team presented on the Women in Thoracic Surgery (WTS) scholarship, showing that, at every level of training, the award was associated with the greater successful pursuit of CT surgery than contemporaries.

There was a multitude of other interesting abstracts presented at that session as well. Dr. Jessica Luc conducted a randomized controlled trial to determine the effect of tweeting on an article’s citations and Altmetric score, which is a composite of social media attention. She found that tweeting an article results in a significantly greater number of citations as well as Altmetric score after a year. This finding may have occurred because the social media promotion of an article may help improve the article’s memorability among authors. In another interesting presentation, Dr. Oliver Chow conducted a national survey of burnout among US CT surgery trainees and found some distressing results. 531 CT surgery trainees responded to his survey from 76 different programs, and 40% screened positively for indicators of depression. 26% would not do CT surgery again if given the choice, and about 50% were dissatisfied with the balance in their professional life. Considering that burnout is linked to poor job satisfaction, increased medical errors, attrition, and physician suicides, the results of this survey indicate that we have to do more to help CT surgery residents deal with their rigorous workload and clear the stigma for asking for help.

Tweeting Predicts Citations in CT Surgery

 

Attending STS was a spectacular experience for me to learn more about medicine at the highest level through informative sessions filled with impactful presentations. It was such a privilege to be able to attend this national surgical conference as a high school student, and I truly enjoyed every moment of it.