Carnegie Mellon University

Center for Machine Learning and Health

Incubating Digital Healthcare Startups

The Center for Machine Learning and Health (CMLH) at Carnegie Mellon University is one of two centers launched under the umbrella of the Pittsburgh Health Data Alliance, formed in 2015 to unite Carnegie Mellon’s unrivaled applied-computing capabilities, the University of Pittsburgh’s world-class health sciences research, and UPMC’s clinical care and commercialization expertise.

Our Focus Areas

The CMLH supports great science and engineering that can lead to innovative health solutions and new businesses. The Center's three focus areas are:

Improving Outcomes

Connect and coordinate the health system to empower clinicians to provide high-quality care in any setting.

Consumer-Oriented Healthcare

Develop solutions that allow consumers to access medical services and information anytime, anywhere, and to engage in all steps of the healthcare journey.

Infrastructure and Efficiencies

Enhance resource allocation, service levels, and care pathways to coordinate and manage the cost of care.

2021 Funding Opportunities

From bench to bedside: the CMLH funds projects that strive to bridge the gap between research and practice. All funded work at CMLH will have a clear line of sight to commercial application. Although many CMLH projects will involve data analytics and machine learning, our approach is technology agnostic. We welcome proposals that involve human-computer interaction, language technologies, information systems, computer graphics, computer vision, artificial intelligence, robotics, electrical engineering, economics, psychology, sociology, public policy, business administration, law, design, and any other disciplines that apply to healthcare.

CMLH initially provides research projects with approximately one year of funding. After one year, projects may attract more funding to refine the technology and/or its development for commercialization.

In addition to offering funding, CMLH provides support for funded projects in the form of datasets, access to patients and doctors for empirical validation of new concepts, and entrepreneurial mentorship.

Awards are intended to support research that transforms healthcare from transactional and experience-centered to data-driven. Once a project is selected for an award, the CMLH and UPMC Enterprises will help identify potential clinical and data-transaction partners and provide guidance related to commercialization activities, as needed.

Proposal submissions are open only to Carnegie Mellon faculty in all units of the university. All proposals should be submitted via email to:


Download an overview of our Center for Machine Learning and Health projects here:

Download the PDF

Scaling Remote Extracorporeal Membrane Oxygenation (ECMO) Training with Intelligent Agents

Project Summary

The success of surgical training relies on the close connection between instructors and trainees. In case-based simulations for extracorporeal membrane oxygenation (ECMO), for example, trainees perform tasks such as cannulation while instructors observe, guide, and correct. Such on-site training, however, does not scale to large numbers of trainees, and instructors must travel to training sites, limiting the availability of the training. Remote training using Augmented Reality and Virtual Reality has the potential to enable scalable and safe training. Current approaches, however, do not provide high-quality training in a scalable manner.

In the proposed research, CMU’s team of David Lindlbauer, PhD, Assistant Professor at the Human-Computer Interaction Institute, and Jean Oh, PhD, Associate Research Professor at the Robotics Institute, in collaboration with Raj Ramanan, MD, Assistant Professor of Critical Care Medicine at UPMC and Medical Director UPMC Procirca ECMO Education, aims to improve the quality and scale of ECMO training by enabling a tight connection between instructors and trainees in a remote setting. This requires instructors to track which trainees require assistance and which show a high level of proficiency. To address this challenge, they propose to pair a novel virtual remote training environment with novel intelligent agents that observe, track, guide, and assist instructors and trainees during the training. As a result, instructors can deliver training to a larger number of trainees without sacrificing the quality of the learning experience, enabling highly scalable remote training. The proposed work is a vital step toward training medical experts in a scalable and efficient way.

The team proposes to address the challenge of providing a scalable, immersive, and tightly connected training experience to instructors and trainees. The technological innovation lies in combining a novel advanced remote training environment with an intelligent agent that supports instructors when working with dozens of trainees simultaneously. This work is the first to focus on the close connection between instructors and trainees, while providing ECMO training in a remote manner. This will allow for training more clinical professionals in ECMO and provide recurrent rather than one-time training.

The researchers see their work as a first step toward improving clinical training in general. The software of the remote training environment will enable them to incorporate other training scenarios such as the closely related extracorporeal cardiopulmonary resuscitation (ECPR) or other parts of clinical practice such as minimally invasive procedures, all simply by designing new environments.


David Lindlbauer, PhD
Assistant Professor at the Human-Computer Interaction Institute
Carnegie Mellon University

Jean Oh, PhD
Associate Research Professor at the Robotics Institute
Carnegie Mellon University

Raj Ramanan, MD
Assistant Professor of Critical Care Medicine at UPMC and Medical Director UPMC Procirca ECMO Education

Smart Sensors for Evaluating and Improving Cochlear Implants

Project Summary

Hearing loss is one of the leading causes of disability worldwide, responsible for an estimated global cost of almost $1 trillion annually. Hearing loss is also personally devastating, as it causes social isolation and is a large risk factor for dementia. When hearing loss progresses beyond the point where hearing aids are effective, acoustic hearing can be augmented with electronic hearing via a cochlear implant (CI). These life changing devices are increasingly seeing use amongst non-deaf patients who have residual acoustic hearing. In these cases, preserving this hearing during implantation yields better hearing outcomes by allowing simultaneous use of a hearing aid and CI. In addition, the acoustic hearing can be lifesaving in emergency situations, such as hearing a fire alarm while the implant is turned off.

Currently, suboptimal cochlear implantation results in 10-15 dB reduction of acoustic hearing on average after the surgery, although some patients experience no detectable loss in hearing due to implantation and others are left completely deaf. In all, only 40% of patients have any residual hearing preservation after surgery. The substantial risk posed to residual hearing due to the implantation limits the eligible population for these devices and is a major barrier to adoption amongst the current eligible population.

Carnegie Mellon University’s Maysam Chamanzar, PhD, Professor of Electrical and Computer Engineering, and Jay Reddy, Postdoctoral Research Associate of Electrical and Computer Engineering, are working on this project. They propose to integrate smart sensors onto the CI electrode arrays, for the first time, to provide real-time quantitative feedback during implantation surgery. Adjusting the insertion technique to minimize these quantitative metrics will aid residual hearing preservation. This will, in turn, expand the eligible patient population for CIs and combat hearing loss by reducing the risk of the implantation procedure. The long-term goal of this project is to develop a novel smart sensor array that attaches to current CI electrode array designs to give feedback to the surgeon during implantation, resulting in better hearing for more people.


Maysam Chamanzar, PhD
Professor of Electrical and Computer Engineering
Carnegie Mellon University

Jay Reddy
Postdoctoral Research Associate of Electrical and Computer Engineering
Carnegie Mellon University

Improving Breast Cancer Diagnosis with Interpretable Machine Learning

Project Summary

Our goal is to increase the effectiveness of breast cancer screenings by using AI to enhance current processes for continually training radiologists using clinical data. To motivate one possible avenue to improve care, we note the well-documented and significant inconsistencies in how various doctors evaluate mammogram screenings. Recall rates (the fraction of patients asked to return for more targeted diagnostic testing) vary widely across doctors. In current quality initiatives, doctors periodically review past cases, often focusing on those that have been retrospectively confirmed to be miscalled. We believe that deep learning can enhance the selection process for review in the paraclinical setting, helping to increase cancer detection rates and reduce false positives, leading to a more effectively treated patient population.

By using hundreds of thousands of mammogram screenings and modeling not only ground truth labels, but also the tendencies of each radiologist, our AI solution will identify ideal training cases for review, e.g., by selecting those instances where either specific doctors are predicted to disagree with the ground truth, or cases where specific doctors are predicted to disagree with each other. Our solution will utilize state-of-the-art computer vision techniques together with human-centered visualizations and design principles, to make the results of the AI interpretable to medical experts with limited AI expertise.


Adam Perer, PhD
Assistant Research Professor, Human-Computer Interaction Institute
Carnegie Mellon University

Zachary Chase Lipton, PhD
Assistant Professor of Operations Research and Machine Learning, Tepper School of Business
Carnegie Mellon University

Brain Tsunamis

Project Summary

The last decade of research has established that Cortical Spreading
Depolarizations (CSDs), or “Brain Tsunamis,” play an important role in many
disorders, including Traumatic Brain Injury (TBI), stroke, hemorrhage, and
migraine, that collectively affect more than a billion people worldwide and
are major causes of death and disability. CSDs are waves of neurochemical
changes that propagate across the brain surface and are thought to
mediate secondary brain damage following an injury or event.

The only reliable way to measure CSDs today is an invasive method using
electrodes; it is costly, risky, and inappropriate for patients with certain
contraindications. In addition, detection is slow and requires a physician to
visually inspect the data. Real-time diagnosis and treatment are thus not
realistic in most clinical settings.

Our solution is a Noninvasive Platform for Automated CSD Detection
and Suppression. Our multidisciplinary team will develop algorithms and
techniques to noninvasively suppress CSDs by using online detection
and ensuing stimulation. This is the critical first step for discovery of
novel therapeutic strategies and would eliminate the need for invasive
monitoring. Real-time detection would allow rapidly iterated, individualized
care, timely intervention, and enhanced patient outcomes.


Pulkit Grover, PhD
Assistant Professor of Electrical
and Computer Engineering
Carnegie Mellon University

Shawn Kelly, PhD
Senior Systems Scientist, Institute for
Complex Engineered Systems
Carnegie Mellon University

Marlene Behrmann, PhD
Professor, Psychology
Carnegie Mellon University

Michael Tarr, PhD
Department Head and Professor, Psychology
Carnegie Mellon University

Jonathan Elmer, MD
Assistant Professor of Critical Care Medicine
and Emergency Medicine
University of Pittsburgh

Lori A. Shutter, MD
Vice Chair of Education, Professor of Critical
Care Medicine, Neurology and Neurosurgery
University of Pittsburgh

Clinical Genomics Modeling Platform

Project Summary

The Clinical Genomics Modeling Platform is an engine for easily building
precision-medicine models for various diseases and populations. Triage
algorithms, for instance, might help to determine if patients with a certain
disease should be sent home with monitoring or sent to the intensive
care unit.

This project focuses on the understanding of the relationship between
genetic variation and medical outcomes in a large population, which is key
to realizing the vision of personalized medicine. Efforts are now underway
to obtain the genome sequences of thousands of individuals. As the cost of
sequencing continues to drop, it will become routine to sequence patients
in a medical setting. However, a number of computational and practical
challenges remain in the way of using genomic sequencing for clinical
decision making.

In this project, Dr. Carl Kingsford and Dr. Christopher
Langmead are aiming to increase the usage of predictive systems based
on machine learning techniques and genomics through the development
and commercialization of a computational system that will:

• Make predictive models easy to build for clinical researchers
• Make predictive models easy to share (sell) and apply
• Make results of models easy to understand

Project Status

The research team is continuing to carry out commercially orientated research on this topic.


Carl Kingsford, PhD
Herbert A. Simon Professor,
Computer Science
Carnegie Mellon University

Christopher Langmead, PhD
Associate Professor, Computational Biology
Carnegie Mellon University

Deep Learning for Placental Pathology

Project Summary

A collaborative team from CMU, Pitt, and UPMC is developing image
analysis software to examine slide images of placentas following
delivery and identify abnormalities that require review by a pathologist.
Abnormalities found in these images can provide critical information about
the health of the infant and mother.

Though the placenta receives a routine examination in the delivery room,
experienced pathologists – especially those with perinatal subspecialty
expertise – are not always available to handle the volume of incoming
placentas requiring examination. The software being developed in this
project would thus reduce the risk that physicians fail to detect
otherwise avoidable complications.

The researchers will apply deep learning techniques to create an algorithm
that will support high-throughput examination of whole-slide images of
placentas and identification of placentas that most need a pathologist’s
attention. This approach will be enabled by using the digitized whole slide
imaging dataset of placenta slides at UPMC, together with the current
rich dataset available from the Magee Obstetric Maternal & Infant (MOMI)
Biobank of placenta images coupled with pathology metadata, clinical data
regarding the delivery, and long-term data about the infant’s health.

Project Status

The research team has published the results of their work. The paper, titled “Decidual Vasculopathy Identification in Whole Slide Images Using Multi-Resolution Hierarchical Convolutional Neural Networks,” was published in The American Journal of Pathology.


Jonathan Cagan, PhD
George Tallman and Florence Barrett
Ladd Professor, Mechanical Engineering
Carnegie Mellon University

Philip R. LeDuc, PhD
William J. Brown Professor,
Mechanical Engineering, Biomedical
Engineering, Computational Biology,
and Biological Sciences
Carnegie Mellon University

Janet M. Catov, PhD, MS
Associate Professor, Department of
Obstetrics, Gynecology and Reproductive
Sciences and Department of Epidemiology
University of Pittsburgh

Liron Pantanowitz, MD
Vice Chairman for Pathology Informatics
and Director of Cytopathology
UPMC Shadyside

Developing a Biomarker for Alzheimer’s Disease Using Machine Learning and Immune Cell Epigenomics

Project Summary

Researchers from Carnegie Mellon University’s School of Computer Science and the University of Pittsburgh School of Medicine are combining areas of expertise and techniques that span machine learning, immunology, and genomics to develop new methods of diagnosing Alzheimer’s disease (AD) that are less expensive than current standards and that may help researchers discover possible treatments.

Currently 5.7 million Americans have AD dementia, including approximately one out of every 10 people over the age of 65. As the population ages, it is projected that 13.8 million Americans will have the disease by 2050. The estimated cost of Alzheimer’s disease is $277 billion annually. Unfortunately, there are still no highly effective treatments or cures for AD.

This team is developing biomarkers that would enable AD to be diagnosed with a blood test, rather than by use of expensive brain imaging techniques. CMU’s Andreas Pfenning has proposed a solution that leverages research in epigenetics and knowledge of the immune mechanisms underlying AD predisposition to develop a novel biomarker. The team will also continue to investigate the impact of genetic changes on AD predisposition and progression of the disease. They hope to leverage these models to inform future research and the design of effective therapeutics for the treatment of AD.


Andreas R. Pfenning
Assistant Professor, Computational Biology
Carnegie Mellon University

Joshua C. Cyktor
Research Assistant Professor, Associate
Director, Virology Laboratory Division of
Infectious Diseases
University of Pittsburgh

John W. Mellors
Distinguished Professor of Medicine; Chief,
Division of Infectious Diseases
University of Pittsburgh Alzheimer’s Disease
Research Center

Diagnosis Coding Engine for Electronic Health Records

Project Summary

Medical diagnostic errors impact 12 million adults each year in the US.
Deaths due to medical diagnostic errors are estimated to be the third
leading cause of death in the US alone, equivalent to the crash of a large
aircraft every day. A key reason why diagnostic errors are
made – even by the best clinicians in highly reliable organizations – is the
increasing complexity of the diagnostic process, with over 10,000 diseases
and 5,000 laboratory tests to choose from.

This project focuses specifically on preventing coding and billing errors.
To address this cognitively complex problem, the team is developing
an engine that will predict likely diagnosis codes based on information
available in a patient’s electronic health record. Specifically, the solution
will review both structured and unstructured data, such as clinical notes,
and apply a machine learning-based mapping from these data to specific
diagnosis codes.

When the diagnosis code suggestions are acknowledged and accepted,
these can then also flow into billing codes through electronic medical
records or other systems. The solution will thus aid clinicians in their
medical management, decision support, accurate documentation, billing
and quality improvement.

Project Status

The research team is continuing to carry out commercially orientated research on this topic.


Pradeep Ravikumar, PhD
Associate Professor, Machine Learning
Carnegie Mellon University

Jeremy Weiss, MD, PhD
Assistant Professor, Health Informatics
Carnegie Mellon University

Evaluating the Predictive Capability of Machine Learning Algorithms in Adult Cardiac Surgery

Project Summary

A team from the University of Pittsburgh Medical Center and Carnegie Mellon University is investigating the application of machine learning to clinical cardiac surgery in ways that could have profound implications in clinical trials, therapy selection, patient prognostication and counseling, and surgeon and hospital evaluations.

According to the American Heart Association, one in three Americans has heart disease. The cost of treating heart disease is expected to triple from $273 billion in 2010 to $818 billion in 2030. This is largely attributable to the aging population. In addition, a shortage of cardiac surgeons may be at hand, with the number of active cardiothoracic surgeons expected to decline from 3,708 to 3,411 between 2010 and 2025. Taken together, these trends have profound implications for cardiac surgery as a field. The increased demand coupled with decreased supply will underscore the need for cost-effective, personalized care that optimizes outcomes for patients with surgical heart disease.

Risk modeling has played a vital role for many years in adult cardiac surgery, with the Society of Thoracic Surgeons (STS) database created in 1989 as an initiative for quality improvement and patient safety. This team of researchers from Pitt/UPMC and CMU is investigating how machine learning might enhance risk modeling. There are opportunities to apply machine learning to data that are already used for risk modeling, and to incorporate vast amounts of additional data that are routinely collected but not used in current models. The ultimate goals of this research are to improve the accuracy of risk models in adult cardiac surgery, introduce new and potentially predictive data, and develop clinical platforms and technology that can provide real-time guidance to physicians in clinical management of cardiac surgery patients to improve outcomes.


Kyle Miller
Project Scientist, Chemical Engineering
Carnegie Mellon University

Arman Kilic, MD
Assistant Professor, Cardiothoracic Surgery
and Director, Surgical Quality and Analytics
for Cardiac Surgery; Co-Director, Center for
Cardiovascular Outcomes and Innovation

In-Home Movement Therapy Data Collection

Project Summary

A team from CMU, Pitt, and UPMC is working on a project to evaluate
the ability of cameras, inertial measurement units, and other sensors, in
combination with machine learning (ML) algorithms, to assess patients’
movement therapy exercises in the home. The long-term goal is to
increase the quality and efficacy of physical therapy (PT) by providing
patients with automated, in-home monitoring, near real-time feedback
on exercise performance, and feedback to providers when issues arise
outside of the PT setting. This will enable providers to prioritize clinic
time for patients whose recovery has stalled while avoiding unnecessary
appointments for those who are progressing satisfactorily.

Project Status

The research team is completing analyses from the project and exploring next steps.


Jessica Hodgins, PhD
Professor, Robotics Institute
Carnegie Mellon University

Dan Siewiorek, PhD
Buhl University Professor,
Electrical and Computer Engineering
and Computer Science
Carnegie Mellon University

Asim Smailagic, PhD
Research Professor, Institute for Complex
Engineered Systems and Director,
Laboratory for Interactive Computer Systems
and Wearable Computers
Carnegie Mellon University

Andrew Whitford, PhD
Senior Research Associate, Robotics Institute
Carnegie Mellon University

Keelan Enseki, MS, PT
Director, Clinical Practice Innovation
UPMC Center for Rehab Services

Adam Popchak, PhD
Research Assistant Professor,
Department of Physical Therapy
University of Pittsburgh

Ingestible Impedance Sensors to Acquire Large-Scale Data Sets from Patients with Eosinophilic Esophagitis

Project Summary

An ingestible sensor may provide a faster, more convenient alternative
to biopsies for diagnosing esophageal disorders, such as eosinophilic
esophagitis (EoE). EoE is an inflammatory disorder of the gastrointestinal
tract that can cause persistent feeding problems, vomiting, and
abdominal pain.

Christopher J. Bettinger of CMU’s Biomedical Engineering Department
is developing an ingestible, flexible sensor that could be used to detect
and monitor EoE and other esophageal disorders. EoE affects more than
150,000 Americans each year and the incidence of EoE is increasing
rapidly, especially in Western Pennsylvania. It is thought to arise from
the patient’s immune response to specific foods and, when properly
diagnosed, can be treated by identifying and eliminating those foods. But
timely diagnosis is challenging because it currently relies on biopsies, which
are slow, expensive, and highly invasive.

If EoE could be reliably monitored on a more frequent basis, the resulting
data could be analyzed in conjunction with a patient’s diet using machine-learning algorithms to identify and eliminate specific foods that may be responsible for causing EoE on a case-by-case basis.

Project Status

The research team is continuing to carry out commercially orientated research on this topic.


Christopher Bettinger, PhD
Associate Professor, Biomedical Engineering
Carnegie Mellon University

Integrating Deep Learning with High Throughput Materials Engineering for Detecting Noroviruses

Project Summary

Researchers from CMU’s Mechanical Engineering Department are
developing technologies for rapidly detecting if surfaces are contaminated
with norovirus, a major cause of epidemic gastroenteritis.

Norovirus, sometimes referred to as the winter vomiting bug, results in
about 685 million cases of disease and 200,000 deaths globally each
year. About half of food-borne disease outbreaks in the United States are
caused by norovirus.

Noroviruses are very contagious, and fewer than 20 virus particles can cause
an infection. One of the challenges faced by healthcare facilities, like
hospitals, is preventing contamination by and transmission of noroviruses.
The spread of the virus is of particular concern during winter, when
ventilation of buildings may be reduced.

Mechanical Engineering’s B. Reeja Jayan and Amir Barati Farimani are
introducing highly sensitive and specific sensors that can rapidly detect
noroviruses and significantly reduce the chance of these infections and
epidemics, especially in healthcare facilities. Their research combines deep
learning techniques with molecular-scale materials engineering to allow
for quick, sensitive detection of norovirus.


Reeja Jayan, PhD
Assistant Professor, Mechanical Engineering,
Materials Science and Engineering
Carnegie Mellon University

Amir Barati Farimani, PhD
Assistant Professor, Mechanical Engineering
Carnegie Mellon University

Machine Learning Algorithms for Advanced Manufacturing of High-Fidelity 3D Printed Biomaterials

Project Summary

CMU’s Newell Washburn, Philip R. LeDuc, and Jennifer Bone are designing
a tool that would enable the 3D printing of biomaterials, removing one of
the obstacles to using additive manufacturing to produce surgical implants
and organ transplants.

The need for bioprinted implants and organs is significant. At any given
time, nearly 3,500 – 4,000 people are waiting for a heart or heart-lung
transplant, and every 10 minutes a new person is added to the national
transplant waiting list. Furthermore, the prevalence of rejection and
immunosuppression currently impacts the success of transplantation and
highlights the need for patient-specific transplants that will increase the
rate of survival. Due to this high demand, the market for patient-specific
implants is projected to reach $1 billion by 2020.

Based on hierarchical machine learning (HML), this work will enable the
manufacturing of high-fidelity 3D constructs from various starting materials, including biological components. The tool will be used by designers, manufacturers, and others who require the ability to integrate information from various models in order to leverage 3D bioprinting for efficient and accurate manufacturing.

Project Status

The research team is continuing to carry out commercially orientated research on this topic.


Newell Washburn, PhD
Associate Professor, Chemistry,
Biomedical Engineering
Carnegie Mellon University

Philip LeDuc, PhD
William J. Brown Professor,
Mechanical Engineering,
Biomedical Engineering, Computational
Biology, and Biological Sciences
Carnegie Mellon University

Jennifer Bone
PhD Candidate, Biomedical Engineering
Carnegie Mellon University

Non-Invasive Intracranial Pressure Monitoring

Project Summary

The healthy brain maintains a relatively constant blood flow
even when there are changes in blood pressure or intracranial
pressure (ICP). The mechanism preserving blood flow is called cerebral
autoregulation (CA), which is known to be impaired in a variety of diseases,
such as diabetes, Parkinson’s disease, stroke, and traumatic brain injury.

ICP can currently be measured only by placing an invasive pressure sensor
inside the brain.

This project seeks to establish an optical technique called near-infrared
spectroscopy (NIRS) as a non-invasive method to monitor ICP
in humans. Development of a non-invasive ICP sensor would optimize
patient treatment in cases where invasive ICP monitoring is not possible or could be
dangerous to the patient.


Jana Kainerstorfer, PhD
Assistant Professor, Biomedical Engineering
Carnegie Mellon University

Elizabeth Tyler-Kabara, MD, PhD
Associate Professor, Neurological Surgery,
Bioengineering, Physical medicine and
UPMC Children’s Hospital of Pittsburgh

Real-Time Tool to Predict Clinical Outcomes After Cardiac Arrest

Project Summary

Over 150,000 Americans are treated in hospitals each year after cardiac
arrest, virtually all of whom are initially comatose. Once their hearts are
revived, most such patients eventually die in the hospital from brain injury,
but a sizable minority will awaken and have a good neurological recovery.
With current technology, no sign, symptom, or test result in the first 72
hours after cardiac arrest is sufficient to preclude a favorable recovery.
Prognosis often remains uncertain for days or even weeks thereafter.

The project team is developing technology that will use advanced
signal processing and modeling methods to allow accurate neurological
prognostication sooner than currently possible.

Project Status

The research team is continuing to carry out commercially orientated research on this topic. For more information, reach out to the PHDA at


Daniel Nagin, PhD
Teresa and H. John Heinz III Professor,
Public Policy and Statistics
Carnegie Mellon University

Jonathan Elmer, MD
Assistant Professor of Critical Care Medicine
and Emergency Medicine
University of Pittsburgh

Cliff Callaway, MD, PhD
Executive Vice Chair, Emergency Medicine
University of Pittsburgh Medical School

Sepsis Phenotyping from Electronic Health Records

Project Summary

Sepsis is a life-threatening organ dysfunction caused by a dysregulated
host response to infection that has a high mortality rate and is a large
burden on the health care industry, costing over $24 billion annually.

This team will use machine learning and artificial intelligence methods to
develop analytic tools to identify sepsis earlier and define subgroups of
patients who are at high risk for sepsis and share other traits. Successful
completion of this research will advance efforts to identify subgroups with
sepsis for whom particular treatments are more effective, thereby reducing
morbidity and mortality. It will also support the development of physician-competitive, health records-based risk scores that can be used for risk stratification and for early warning clinical decision support. This proposal leverages an established collaboration between experts in machine learning and sepsis at CMU, Pitt, and UPMC.


Jeremy Weiss, MD, PhD
Assistant Professor, Health Informatics
Carnegie Mellon University

Christopher Seymour, MD
Associate Professor, Critical Care and
Emergency Medicine
University of Pittsburgh


Towards Automated Multimodal Behavioral Screening for Depression

Project Summary

Depression is a leading cause of disability worldwide. Effective, evidence-based treatments for depression exist, but many individuals suffering from
depression go undetected and therefore untreated. Efforts to increase the
accuracy, efficiency, and adoption of depression screening thus have the
potential to minimize human suffering and even save lives.

Recent advances in computer sensing technologies provide exciting
new opportunities to improve depression screening, especially in terms
of their objectivity, scalability, and accessibility. Professor Morency
and Dr. Szigethy are collaborating to develop sensing technologies to
automatically measure subtle changes in individuals’ behavior that are
related to affective, cognitive, and psychosocial functioning.

Their goal is to develop and refine computational tools that automatically measure depression-related behavioral biomarkers and to evaluate the clinical utility of these measurements.


Louis-Philippe Morency, PhD
Finmeccanica Associate Professor,
Language Technologies
Carnegie Mellon University

Eva Szigethy, MD, PhD
Professor, Psychiatry, Medicine and
Pediatrics; Director, Behavioral Health with
the Chief Medical and Scientific Office
University of Pittsburgh

My Healthy Pregnancy

Project Summary

The costs of preterm births to the healthcare system and society are
exceedingly large, and the incidence of preterm births is on the rise
despite medical advances. This has created a need to provide reliable,
personalized information to pregnant women about how they can reduce
their risk of premature births. In response, researchers have created the
“MyHealthyPregnancy” smartphone app.

The app applies statistical machine learning techniques to comprehensive
pregnancy data sets to improve the app’s patient-specific risk predictions
of adverse pregnancy outcomes. It also employs decision science
techniques to extend the app to provide post-partum education and
behavioral feedback to minimize infant mortality outcomes. The team,
which has performed pilot studies of the concept, expects this work will
result in a product that can be both tailored to the UPMC health system
and developed into an off-the-shelf application for the general population
of pregnant women.

Project Status

MyHealthyPregnancy formed a startup called Naima Health.

*UPMC has a financial interest in Naima Health.

For more information, visit their website or reach out to the PHDA here.


Tamar Krishnamurti, PhD
Assistant Research Professor,
Engineering and Public Policy
Carnegie Mellon University
Assistant Professor, Medicine
University of Pittsburgh

Hyagriv Simhan, MD, MS
Professor, Obstetrics, Gynecology and
Reproductive Science; Division Chief,
Maternal Fetal Medicine
Magee-Womens Hospital

Alexander Davis, PhD
Assistant Professor, Engineering
and Public Policy
Carnegie Mellon University

Phylogenetic Models for Predicting Cancer Progression

Project Summary

Despite screening technologies and public health efforts that have
improved the detection of early-stage cancers, methods to reliably
predict which of these cancers will progress to more aggressive stages,
metastasis, and ultimately patient mortality are lacking.

The project team is working on a suite of software tools that clinicians will
use to improve cancer diagnosis and therapeutics based on the molecular
signatures of the patient’s tumor genome.

The goal of the project is to develop novel models, algorithms, and
software tools to better understand the origin and evolution of tumor cells
and how a patient’s tumors are likely to progress. This information would
contribute to personalized, precision treatment of cancer.

Project Status

The research team is continuing to carry out commercially oriented research on this topic.

For more information, reach out to the PHDA here.


Russell Schwartz, PhD
Professor, Biological Sciences and
Computational Biology
Carnegie Mellon University

Jian Ma
Associate Professor, Computational Biology
Carnegie Mellon University

Computational Modeling of Behavioral Rhythms to Predict Readmissions

Project Summary

Readmission after complex cancer surgery is common, with studies
reporting between 15% and 50% of patients being readmitted within 30
days of discharge from the hospital. Readmissions are associated with
increased healthcare costs, poor clinical outcomes including increased
risk of infection and early mortality, and patient and family stress and
suffering. Prior research has identified demographic and clinical predictors
of readmission, but the role of patient-centered behavioral processes
remains relatively unexplored. Identifying behavioral factors associated
with readmission, which might include physical activity levels, sleep habits,
and social contacts, could highlight ways to prevent readmissions and
empower patients to take a more active role in their recovery.

The project team is collaborating to take a generalizable and scalable
approach that holistically looks at patients’ behavior before and after
surgery and identifies routines in this behavior. These routines will form
the basis of their understanding of the predictive factors for readmissions,
particularly while patients are still in the hospital after surgery or other

Harnessing technology to monitor these and other behavioral risks before
surgery, during inpatient recovery, and during the critical transition from
hospital to home will advance the field in a number of ways:

• Reliably assessing and computationally modeling behavioral risks
associated with readmissions to improve risk stratification
• Identifying optimal targets and timing for behavioral intervention to
reduce preventable readmissions
• Providing data that may be helpful and motivating for patients and
that could inform clinical decision making to improve quality of care

Project Status

The research team is evaluating the potential clinical value of their models and continues to develop new analytical techniques related to this study.

For more information, reach out to the PHDA here.


Anind Dey, PhD
Charles M. Geschke Director, Human-
Computer Interaction Institute
Carnegie Mellon University

Carissa Low, PhD
Assistant Professor, Hematology/Oncology,
Center for Behavioral Health and Smart
University of Pittsburgh

Programming Framework for Managing Patient Privacy

Project Summary

As health data is increasingly digitized, the need to protect patient privacy
is unprecedented. The Programming Framework for Managing Patient
Privacy team is collaborating on an approach to protect patients’ health
records and computations involving sensitive patient data from being
leaked in a breach.

Project researchers propose a programming model and software
framework for managing the privacy of health data. The programming
model allows programs to attach privacy policies directly to the data, rather
than requiring the programmer to implement the correct privacy checks
across code that uses the data. Attaching privacy policies to the data
enables programs to be policy-agnostic and to be updated independently
of the specification and implementation of privacy policies. The system
tracks the flow of data, as well as the policies, so the programmer need
not do so. This approach facilitates auditing, as policies may now be
centralized and implemented only once, instead of as checks across
the program. Auditors may also leverage the fact that the framework,
rather than the programmer, is now responsible for correctly implementing
the checks.
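
The idea of attaching a policy to the data can be illustrated with a minimal Python sketch. This is not the project's framework: the names `Viewer`, `Guarded`, `map`, `reveal`, and `clinicians_only` are hypothetical and chosen only for illustration. The sketch assumes a policy is a function from a viewer to a yes/no decision, and that every guarded value carries a public placeholder to show when the policy denies access.

```python
from dataclasses import dataclass
from typing import Callable, Generic, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class Viewer:
    """Hypothetical description of whoever is asking to see the data."""
    name: str
    role: str


@dataclass(frozen=True)
class Guarded(Generic[T]):
    """A value bundled with the policy that decides who may see it."""
    secret: T                         # the sensitive value
    public: T                         # placeholder shown when access is denied
    policy: Callable[[Viewer], bool]  # attached once, travels with the data

    def map(self, fn: Callable[[T], T]) -> "Guarded[T]":
        # Derived values keep the same policy, so application code
        # never has to repeat the privacy check.
        return Guarded(fn(self.secret), fn(self.public), self.policy)

    def reveal(self, viewer: Viewer) -> T:
        # The single place where the check is enforced.
        return self.secret if self.policy(viewer) else self.public


def clinicians_only(viewer: Viewer) -> bool:
    """Example policy (an assumption): only clinicians see the real value."""
    return viewer.role == "clinician"


# Application code is policy-agnostic: it formats the value without
# knowing or checking who will eventually read it.
diagnosis = Guarded("sepsis", "[redacted]", clinicians_only)
summary = diagnosis.map(lambda d: f"Diagnosis: {d}")

seen_by_clinician = summary.reveal(Viewer("Dr. A", "clinician"))
seen_by_billing = summary.reveal(Viewer("B", "billing"))
```

In this sketch an auditor only needs to inspect `clinicians_only` and `reveal`; the formatting code can change without touching either.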

This project is the first step in a long-term plan to improve patient care
and promote scientific discovery by allowing sensitive health data to be
shared safely. This plan involves 1) applying prior work on policy-agnostic
programming for information flow policies in the context of patient health
records, 2) extending and combining prior work to support more expressive policies, and 3) building a practical framework that can be used to safely share sensitive data.

Project Status

The research team is completing analyses from their project and exploring next steps.

For more information, reach out to the PHDA here.


Jean Yang, PhD
Assistant Professor, Computer Science
Carnegie Mellon University

Matt Fredrikson, PhD
Assistant Professor, Computer Science and
Institute for Software Research
Carnegie Mellon University

Jian Ma
Associate Professor, Computational Biology
Carnegie Mellon University

Healthcare Trails and Care Coordination Using Admissions, Discharges, and Transfers Data

Project Summary

This project is determining if coordinated care decisions could be
improved by analyzing and monitoring how patients move through
the healthcare system.

The team will focus on healthcare trails – time-ordered sequences
in which a patient obtains services such as dialysis, blood work, or
psychological counseling. These trails will be analyzed to identify those
that are commonly used by patients with similar conditions and those
that most often improve patient conditions. Factors such as a patient’s
age, healthcare coverage, and medical condition can affect the flow of
these services, and differences in flow can be associated with different
patient outcomes.
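
As a rough illustration of what a healthcare trail looks like as data, the Python sketch below groups service events by patient, orders them in time, and counts which trails occur most often. It is an assumption-laden toy, not the team's algorithm: the event format `(patient_id, date, service)`, the sample records, and the function names `build_trails` and `most_common_trails` are all hypothetical.

```python
from collections import Counter, defaultdict


def build_trails(events):
    """Turn (patient_id, date, service) events into one ordered trail per patient.

    Dates are assumed to be ISO strings (YYYY-MM-DD), so sorting the
    strings also sorts them chronologically.
    """
    by_patient = defaultdict(list)
    for patient_id, date, service in events:
        by_patient[patient_id].append((date, service))
    return {
        patient_id: tuple(service for _, service in sorted(visits))
        for patient_id, visits in by_patient.items()
    }


def most_common_trails(trails, n=3):
    """Count identical trails across patients and return the n most frequent."""
    return Counter(trails.values()).most_common(n)


# Made-up sample events, deliberately out of order for patient p1.
events = [
    ("p1", "2021-01-03", "blood work"),
    ("p1", "2021-01-01", "admission"),
    ("p1", "2021-01-05", "discharge"),
    ("p2", "2021-02-01", "admission"),
    ("p2", "2021-02-02", "blood work"),
    ("p2", "2021-02-04", "discharge"),
    ("p3", "2021-03-01", "admission"),
    ("p3", "2021-03-02", "dialysis"),
    ("p3", "2021-03-06", "transfer"),
]

trails = build_trails(events)
top = most_common_trails(trails, n=1)
```

A fuller analysis would also link each trail to patient outcomes and to factors such as age or coverage, which this sketch leaves out.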

The goal is to develop, test, and assess algorithms and interfaces
for analyzing hospital admissions data and for providing feedback to
doctors and caregivers through automated patient tracking and notification
systems. By analyzing the timing of admits, discharges, and transfers,
the application could help inform caregivers about how a patient’s care
is being managed and detect when a patient might be at risk because
of the healthcare trail they are on.

Project Status

The Health Care Trails research team is examining the outputs from this project and potential future directions.

For more information, reach out to the PHDA here.


Kathleen Carley, PhD
Professor, Institute for Software Research
Carnegie Mellon University

L. Richard Carley, PhD
Professor, Electrical and Computer
Carnegie Mellon University

Detecting Intestinal Activity by Analyzing Gut Sounds

Project Summary

A non-invasive device that analyzes sounds from the intestinal tract could
become a powerful new tool to help physicians diagnose and monitor a
variety of gastric illnesses, such as acute pancreatitis, bowel obstructions,
irritable bowel syndrome, inflammatory bowel disease, and Crohn’s disease.

CMU researchers, collaborating with UPMC gastroenterologists,
will be developing, testing, and performing clinical research on a wearable
sensor array to detect intestinal sounds – a new vital sign for the “gut.” Its
impact could be substantial: In the U.S. alone, digestive diseases afflict
60-70 million people annually, with treatment costs totaling more than $100 billion.

Using machine learning techniques, the researchers will develop a means
of interpreting the acoustic signals to examine the determinants of gut
activity and activity suppression, the ability to identify gastric disorders on
the basis of sounds, and the device’s therapeutic potential for enabling
gut regulation via biofeedback. The project will also develop classification
procedures that use these features to distinguish between the various
types of gastrointestinal activity.

Project Status

The research team has wrapped up this project and is pursuing new directions and applications for their innovative approaches.

For more information, reach out to the PHDA here.


George Loewenstein, PhD
Herbert A. Simon University Professor,
Economics and Psychology
Carnegie Mellon University

Ali Momeni, PhD
Associate Professor, Art
Carnegie Mellon University

Rich Stern, PhD
Professor, Electrical and Computer
Carnegie Mellon University

Max Gsell, PhD
Associate Professor, Statistics
Carnegie Mellon University

Valerie Ventura, PhD
Associate Professor, Statistics
Carnegie Mellon University


Carl Kingsford, PhD

Executive Director

Carl Kingsford, PhD, is the Executive Director for the Center for Machine Learning and Health. Carl holds the rank of Full Professor in the Computational Biology Department at Carnegie Mellon University. He received his PhD in Computer Science from Princeton University in 2005 and was an Assistant Professor at the University of Maryland before joining CMU. His main research interests involve key problems within computational systems biology and genomics, such as predicting protein function, reconstructing ancient biological pathways, and improving genome assembly.


Joe Marks, PhD

Advisory Board

Joe Marks, PhD, moved to a new role as a member of the Center for Machine Learning and Health Advisory Board after leaving his role as the Executive Director. Joe has been involved with invention and innovation for more than 30 years in multiple contexts (academic, corporate, entrepreneurial) and multiple industries (defense, computers, electrical equipment, media & entertainment, marketing, healthcare). He holds an AB in Applied Mathematics and a PhD in Computer Science, both from Harvard University. His areas of technical expertise include AI, HCI, and CG.


Ari Lightman

Commercialization Advisor

Distinguished Service Professor of Digital Media and Marketing, Heinz College, Carnegie Mellon University

Ari Lightman is the Commercialization Advisor for the Center for Machine Learning and Health. Ari is a Distinguished Service Professor, Digital Media and Marketing at Carnegie Mellon University’s Heinz College of Information Systems and Public Policy. He instructs physicians within CMU’s Master’s Medical Management program and is the faculty lead for Accelerate IT, a joint executive education program between The Tepper Business School and Roche. He received his MBA from Carnegie Mellon University, MSc in Engineering from the University of Pittsburgh and a BSc in Molecular Biology from the University of Toronto. Ari and his family reside in Pittsburgh, PA.


Heather Johnson

Manager of Operations

Nicole Flynn

Project Services Manager
