To help discover valid, novel, and significant causal relationships in big biomedical data that lead to new insights in health and disease.

As an inaugural member of the NIH Big Data to Knowledge (BD2K) Consortium, the Center for Causal Discovery (CCD) will:

  • Develop highly efficient causal discovery algorithms that can be practically applied to very large biomedical datasets
  • Conduct projects addressing three distinct biomedical questions (cancer driver mutations, lung fibrosis, brain causome) as a vehicle for algorithm development and optimization
  • Disseminate causal discovery algorithms, software, and tools
  • Train data scientists and biomedical investigators in the use of CCD tools
  • Train data scientists and biomedical investigators to collaborate in the discovery of causality

Led by Drs. Gregory Cooper, Ivet Bahar, Jeremy Berg, and Clark Glymour (see figure below), the Center is a partnership among data scientists from the University of Pittsburgh (Pitt), Carnegie Mellon University (CMU), and the Pittsburgh Supercomputing Center (PSC). Together they will develop the algorithms, software, and system architecture needed by biomedical scientists seeking to discover and represent causality using their large and diverse data sets. We are joined by collaborators from Yale University, California Institute of Technology, Rutgers University, Stanford University, the University of Crete, and the University of North Carolina. We receive guidance and insight from exceptional Internal and External Advisory Boards.

We invite you to explore our tools, our tutorials, and our projects both within the Center and with other BD2K Consortium members to find what you need to start discovering new causal knowledge in your own data.

You can find more information about the CCD and other centers in the BD2K WOW Stories or join the weekly lecture series: The BD2K Guide to the Fundamentals of Data Science Series.

[Figure: Overall Big Questions]

The Center was featured in the following article from the Journal of the American Medical Informatics Association:

The Center for causal discovery of biomedical knowledge from Big Data
Gregory F. Cooper; Ivet Bahar; Michael J. Becich; Panayiotis V. Benos; Jeremy Berg; Jeremy U. Espino; Clark Glymour; Rebecca Crowley Jacobson; Michelle Kienholz; Adrian V. Lee; Xinghua Lu; Richard Scheines
Journal of the American Medical Informatics Association 2015;1-6.
doi: 10.1093/jamia/ocv059

We are supported by the National Institutes of Health (U54HG008540) and the National Science Foundation (1445606, 1261721). Center members are asked to use the following acknowledgment in all publications and presentations:

Research reported in this publication was supported by grant U54HG008540 awarded by the National Human Genome Research Institute through funds provided by the trans-NIH Big Data to Knowledge (BD2K) initiative. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.