We are implementing an integrated set of methods that support the graphical representation, discovery, and application of causal knowledge from large and complex biomedical data (see samples of structural causal relationships in the figure).


We are using two major classes of algorithms: constraint-based algorithms and Bayesian algorithms.
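To make the constraint-based idea concrete, here is a minimal sketch that is not taken from the CCD software. Constraint-based algorithms decide which edges to keep by testing conditional independence in the data. The example simulates a toy causal chain X → Z → Y and applies a Fisher Z test to (partial) correlations, one common choice for continuous data. The function names, sample size, noise model, and choice of test are all illustrative assumptions.

```python
import math
import random
from statistics import NormalDist


def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def partial_corr(x, y, z):
    """Correlation of x and y after linearly controlling for one variable z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def fisher_z_pvalue(r, n, cond_set_size):
    """Two-sided p-value for the null hypothesis that the correlation r is zero."""
    z = 0.5 * math.log((1 + r) / (1 - r))
    stat = math.sqrt(n - cond_set_size - 3) * abs(z)
    return 2 * (1 - NormalDist().cdf(stat))


def simulate_chain(n, seed=0):
    """Toy data (an assumption for illustration): X -> Z -> Y with Gaussian noise."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    z = [xi + rng.gauss(0, 1) for xi in x]
    y = [zi + rng.gauss(0, 1) for zi in z]
    return x, y, z


n = 2000
x, y, z = simulate_chain(n)

# Marginal test: are X and Y dependent when nothing is conditioned on?
r_marginal = pearson(x, y)
p_marginal = fisher_z_pvalue(r_marginal, n, 0)

# Conditional test: are X and Y still dependent once Z is controlled for?
r_given_z = partial_corr(x, y, z)
p_given_z = fisher_z_pvalue(r_given_z, n, 1)

print(f"X vs Y:         r = {r_marginal:.3f}, p = {p_marginal:.3g}")
print(f"X vs Y given Z: r = {r_given_z:.3f}, p = {p_given_z:.3g}")
```

On data like this, one would expect X and Y to look strongly dependent on their own and approximately uncorrelated once Z is conditioned on. A constraint-based search uses that pattern to remove the direct edge between X and Y. Bayesian algorithms instead assign scores to candidate graphs and search for high-scoring ones.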

In partnership with our Systems Architecture group, we are also optimizing highly efficient versions of these algorithms (e.g., emphasizing parallelization), so that they are practical to apply to such challenging data.

We are evaluating our algorithms for discovery accuracy and efficiency using both real and simulated data. Our three driving biomedical projects – cancer, lung, brain – provide real data with which to develop and optimize our algorithms.
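As an illustration of how discovery accuracy might be measured on simulated data, where the true graph is known, the sketch below compares the adjacencies of an estimated graph against the ground truth. Adjacency precision and recall are assumed here as one plausible metric; the text above does not say which metrics are used, and the graphs and function names are hypothetical.

```python
def adjacency_set(edges):
    """Ignore edge direction: represent each edge as an unordered pair."""
    return {frozenset(edge) for edge in edges}


def adjacency_precision_recall(true_edges, estimated_edges):
    """Fraction of estimated adjacencies that are true, and of true ones recovered."""
    truth = adjacency_set(true_edges)
    estimate = adjacency_set(estimated_edges)
    correct = len(truth & estimate)
    precision = correct / len(estimate) if estimate else 1.0
    recall = correct / len(truth) if truth else 1.0
    return precision, recall


# Hypothetical ground-truth graph from which data would be simulated.
true_graph = [("A", "B"), ("B", "C"), ("C", "D")]
# Hypothetical output of a discovery algorithm run on that simulated data.
estimated_graph = [("B", "A"), ("B", "C"), ("A", "D")]

precision, recall = adjacency_precision_recall(true_graph, estimated_graph)
print(f"adjacency precision = {precision:.2f}, recall = {recall:.2f}")
# prints "adjacency precision = 0.67, recall = 0.67"
```

This version scores only whether two variables are connected. Checking edge orientation would need a separate comparison of arrow directions.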

We will evaluate our algorithms and optimized system architecture for usability and acceptability by the biomedical investigators who use our software interface to apply our algorithms to their data. Feedback from their evaluations will drive improvement of the system and its user interface in a continuous feedback cycle.

Our library will comprise the best causal discovery algorithms reported in the literature, including algorithms we have developed, and new causal discovery algorithms needed to support the analysis of large and complex biomedical datasets.



The CCD Causal Software suite offers easy-to-use software for causal discovery from large and complex biomedical datasets, applying Bayesian and constraint-based algorithms. It includes a web application as well as APIs and a command-line version.

Causal Web

Our user-friendly web application for performing causal discovery analysis on big data using large-memory servers at the Pittsburgh Supercomputing Center. Use this software if you want to quickly try out a causal discovery algorithm or if you have big data that cannot be analyzed on your local hardware.

User guide Web app

Causal Command

A Java library and command-line implementation of algorithms for performing causal discovery on big data. Use this software if you are interested in incorporating analysis via a shell script or in a Java-based program. The ‘Software’ button below leads to a comprehensive repository. Choose ‘causal-cmd-x.x.x-jar-with-dependencies.jar’ from the downloads list when using this as an executable via the command line or as an API in a Java program.

The software currently includes:

User guide Software

Py Causal

(early release) – a Python module that wraps algorithms for performing causal discovery on big data.

User guide Software Docker

R Causal

(early release) – an R module that wraps algorithms for performing causal discovery on big data.

The software currently includes:

User guide Software Docker


TETRAD is a desktop Java application that creates, simulates data from, estimates, tests, predicts with, and searches for causal and statistical models. It can connect to outside supercomputing resources if necessary. The aim of the program is to provide sophisticated methods in a friendly interface requiring very little statistical sophistication of the user and no programming knowledge.

User guide Tetrad

Documentation on our algorithms can be found here.

Please report any bugs that you encounter on the issue trackers in our software repositories on GitHub.

Please also sign up for our CCD User Group listserv to receive updates on software releases, training events, hackathons, and datathons.

If you use our software in your research, please acknowledge the Center for Causal Discovery, supported by grant U54HG008540, in any papers, presentations, or other dissemination of your work.

All software is open-source and released under a dual licensing model. For non-profit institutions, the software is available under the GNU General Public License (GPL) v2. For-profit organizations that wish to commercialize enhanced or customized versions of the software will be able to purchase a commercial license on a case-by-case basis. The GPL permits individuals to modify the source code and to share modifications with other colleagues and investigators. Specifically, it permits the dissemination and commercialization of enhanced or customized versions as well as incorporation of the software or its pieces into other license-compatible software packages, as long as modifications or enhancements are made open source.

The software packages listed above are early-release versions. By using software provided by the Center for Causal Discovery, you agree that no warranties of any kind are made by Carnegie Mellon University or the University of Pittsburgh with respect to the data provided by the software or any use thereof, and the universities hereby disclaim the implied warranties of merchantability, fitness for a particular purpose, and non-infringement. The universities shall not be liable for any claims, losses, or damages of any kind arising from the data provided by the software or any use thereof.