Center for Causal Discovery
Datathon Winners for 2017

The Center for Causal Discovery (CCD) held its second annual datathon at the end of the 2017 Summer Short Course on Causal Discovery at Carnegie Mellon University, Pittsburgh, Pennsylvania. The five-day course, held June 12 to 16, included three and a half days on Causal Discovery from Biomedical Data and one and a half days on the Causal Discovery Datathon.

The short course on causal discovery with biomedical data was intended for biomedical and data scientists, from graduate students up, who sought training in causal discovery and graphical modeling of biomedical data. The course was directed by Dr. Richard Scheines, who gave a series of didactic lectures on causal modeling and discovery.

“The course was attended by about 75 extremely bright and energetic participants, ranging from graduate students in data science to senior policy administrators in public health,” said Dr. Scheines.

The course also included breakout groups that considered specific data analysis topics, including the discovery of cancer genomic drivers, the causes of severity and progression in chronic obstructive pulmonary disease, and the functional (causal) connectivity of the human brain.

Datathon Winners:

First Prize – $250

Imaging Genetics for Alzheimer’s Disease
Sjoerd Huisman

Sjoerd Huisman’s project, Imaging Genetics for Alzheimer’s Disease, highlights “the relationship between genetic variant single nucleotide polymorphisms (SNPs), the sizes in several brain areas, and disease status.” Using data from the Alzheimer’s Disease Neuroimaging Initiative (ADNI), the team built a causal model using the Fast Causal Inference (FCI) algorithm within the Tetrad system.

Second Prize – $150

Construct Gene Regulatory Network (GRN)
Duc Do, Marquette University
Linh Pham, University of Wisconsin

The goal of this submission was to construct a gene regulatory network using both normal and cancer tissue data. The submitters wanted to know which factors (gene regulators) cause differences in gene expression between normal and cancer tissue. They used breast cancer data from The Cancer Genome Atlas (TCGA). To construct their network, they used each gene's mRNA expression, the expression of its potential miRNA regulators and transcription factors, copy number alteration (CNA), and DNA methylation level data, and ran the Fast Greedy Equivalence Search (FGES) algorithm in R causal.

Third Prize – $100

Brain Effective Connectivity in fNIRS
Samuel Antonio Montero Hernández
Instituto Nacional de Astrofisica, Optica y Electronica (INAOE), Mexico

The brain is one of the most complex dynamic systems in nature. Typically, connectivity in the brain is studied at three levels: anatomical, functional, and effective. Anatomical connectivity addresses the physical links among neuronal populations. Functional connectivity concerns the statistical dependencies between brain regions. Effective connectivity concerns the influence of one region over another, including the causal direction of that influence. The aim of this work is to determine the set of effective connectivity networks (ECN) while a group of subjects performs a set of specific tasks and their brain activity is monitored with a functional near-infrared spectroscopy (fNIRS) neuroimaging machine. This work used the IMaGES algorithm in Tetrad.