Publications

Summarize with Caution: Comparing Global Feature Attributions. IEEE Data Engineering Bulletin on Responsible AI and Human-AI Interaction, 2021.


Efficient and Explainable Risk Assessments for Imminent Dementia in an Aging Cohort Study. IEEE Journal of Biomedical and Health Informatics (Special Issue on Explainable AI for Clinical and Population Health Informatics), 2021.
*, ** indicate equal contribution and authors are listed alphabetically


Opportunities for Bayesian Network Learning in Personal Informatics Tools. CHI 2020 Workshop on Artificial Intelligence for HCI: A Modern Approach, 2020.


Selected Projects

Many people, especially those with chronic health conditions such as irritable bowel syndrome and chronic migraine, track health-related variables in the hope of discovering connections between causes and symptoms and making more informed choices about their health. Gaining actionable insights from this kind of data is known to be difficult because current tools often fail to analyze data automatically in scientifically rigorous, helpful, and actionable ways. Current tools also often fail to account for lapses in tracking, evolving health goals, and the growing burden of tracking more information. This tool uses a Bayesian network analysis framework to support individuals in analyzing their personal health data in evolving, real-life contexts.
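As a minimal illustration of the kind of cause-symptom relationship such a tool might surface, the sketch below estimates one edge of a Bayesian network as a conditional probability table from tracked observations. This is not the paper's tool; the trigger/symptom names and the log entries are invented for the example.

```python
# Hedged sketch: estimating P(symptom | trigger) from a personal tracking log.
# Each entry is an (ate_dairy, had_migraine) observation; all data are invented.
from collections import Counter

log = [(True, True), (True, True), (True, False),
       (False, False), (False, False), (False, True)]

counts = Counter(log)

def p_symptom_given(trigger_value):
    """P(symptom | trigger = trigger_value), estimated from the log."""
    total = sum(n for (t, _), n in counts.items() if t == trigger_value)
    hits = counts[(trigger_value, True)]
    return hits / total

p_with = p_symptom_given(True)      # 2/3 on this toy log
p_without = p_symptom_given(False)  # 1/3 on this toy log
```

A full Bayesian network generalizes this idea to many tracked variables at once, learning which conditional dependencies the data actually support.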

Local interpretability methods are widely used because of their ability to generate explanations tailored to individual data points, even for complex black-box models. Although these methods are not designed to provide a global view of a model’s behavior, many common interpretability tools offer makeshift global feature attributions obtained by taking the mean absolute value of each feature’s (local) attribution scores across all training data points and then ranking the features by their average scores. We argue that averaging feature attribution scores may not always be appropriate and explore the ramifications of doing so. We present an artifact-based interview study intended to investigate whether ML developers would benefit from being able to compare and contrast different global feature attributions obtained by ranking features by other summary statistics of their attribution scores. We find that participants are able to use these global feature attributions to achieve different tasks and objectives. Viewing multiple global feature attributions increased participants’ uncertainty in their understanding of the underlying model as they became more aware of the intricacies of the model’s behavior. However, participants expressed concerns about the time it would take to compare and contrast different global feature attributions, echoing observations from prior work about the need to balance the benefits of thinking fast and thinking slow when designing interpretability tools. This project was started during an internship with Jenn Wortman Vaughan and Hanna Wallach in the FATE group at Microsoft Research.
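The summarization step can be sketched in a few lines: rank features by some summary statistic of their local attribution scores. The attribution matrix and feature names below are synthetic stand-ins; real scores would come from a local method such as SHAP or LIME.

```python
# Sketch: building global feature attributions by summarizing local scores.
# Different summary statistics (here: mean vs. max of absolute scores)
# can yield different global rankings.

def global_ranking(attributions, feature_names, summarize):
    """Rank features by applying `summarize` to each feature's |local scores|."""
    scores = {
        name: summarize([abs(row[j]) for row in attributions])
        for j, name in enumerate(feature_names)
    }
    return sorted(scores, key=scores.get, reverse=True)

features = ["age", "income", "tenure"]
# Rows are data points; columns are per-feature local attribution scores.
local_scores = [
    [0.9, 0.1, 0.2],
    [0.1, 0.8, 0.3],
    [0.1, 0.7, 0.4],
]

mean_rank = global_ranking(local_scores, features, lambda xs: sum(xs) / len(xs))
max_rank = global_ranking(local_scores, features, max)
# The statistics disagree here: "age" has one extreme local score, so it
# ranks first under max but only second under the mean.
```

Comparing such rankings side by side is exactly the activity the interview study probes: each statistic surfaces a different facet of the model's behavior.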

As the aging US population grows, scalable approaches are needed to identify individuals at risk for dementia. Common prediction tools have limited predictive value, involve expensive neuroimaging, or require extensive and repeated cognitive testing. None of these approaches scale to the sizable aging population who do not receive routine clinical assessments. Our study seeks a tractable and widely administrable set of metrics that can accurately predict imminent (i.e., within three years) dementia onset. To this end, we develop and apply a machine learning (ML) model to an aging cohort study with an extensive set of longitudinal clinical variables to highlight at-risk individuals with better accuracy than standard rudimentary approaches. Next, we reduce the burden needed to achieve accurate risk assessments for those deemed at risk by (1) predicting when consecutive clinical visits may be unnecessary, and (2) selecting a subset of highly predictive cognitive tests. Finally, we demonstrate that our method successfully provides individualized prediction explanations that retain non-linear feature effects present in the data. Our final model, which uses only four cognitive tests (less than 20 minutes to administer) collected in a single visit, affords predictive performance comparable to a standard 100-minute neuropsychological battery and personalized risk explanations. Our approach shows the potential for an efficient tool for screening and explaining dementia risk in the general aging population.
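One way to reduce the testing burden as described above is to select a small subset of tests that retains most of the predictive signal. The sketch below shows a generic greedy forward selection; it is not the paper's exact method, and the test names, scores, and redundancy model are all invented for illustration (a real version would score subsets by validation performance of a trained model).

```python
# Hedged sketch: greedy forward selection of a small subset of cognitive
# tests. `score` stands in for validation performance on a given subset.

def greedy_select(tests, score, k):
    """Greedily add the test that most improves `score` until k are chosen."""
    chosen = []
    for _ in range(k):
        best = max((t for t in tests if t not in chosen),
                   key=lambda t: score(chosen + [t]))
        chosen.append(best)
    return chosen

# Toy per-test values standing in for each test's predictive contribution
# (test names are illustrative, not the study's actual battery).
value = {"MMSE": 0.30, "trails_b": 0.20, "word_recall": 0.15, "clock_draw": 0.05}

def score(subset):
    # Diminishing returns: each additional test contributes half as much as
    # the previous one, mimicking redundancy between correlated tests.
    vals = sorted((value[t] for t in subset), reverse=True)
    return sum(v * (0.5 ** i) for i, v in enumerate(vals))

picked = greedy_select(list(value), score, k=2)
```

Under this toy scoring, the selection stops once added tests contribute little beyond the ones already chosen, which is the intuition behind replacing a 100-minute battery with a few highly predictive tests.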

A customizable surgical anesthesia monitor built with D3, based on needs identified through interviews with doctors. It displays real-time waveform vitals data alongside trends in past vitals, lets users explore past and present waveforms side by side and customize which waveforms and trends they see, and automatically calculates vitals statistics not available in current displays.

Contact

  • amokeson [at] gmail [dot] com