Causal Discovery from Biomedical Data – June 13-17, 2016, Carnegie Mellon University, Pittsburgh, PA
Location: Baker Hall, Giant Eagle Auditorium, A51, CMU
Course Director: Richard Scheines, PhD
Causal Discovery Datathon – June 17-18, 2016, Carnegie Mellon University, Pittsburgh, PA
Location: Baker Hall, Giant Eagle Auditorium, A51, CMU
Datathon Director: Jeremy Espino, MD, MS
Attendees, whether biomedical scientists or data scientists, are expected to know basic statistical principles; no prior graphical modeling experience is needed. To prepare for the 2016 Short Course and better follow the material, you can review presentations from the 2015 Summer Short Course on Causal Discovery from Biomedical Data and the Causal and Statistical Reasoning online course. If you took the 2015 Short Course or the CCD workshop at the AMIA Joint Summit in San Francisco (March 23, 2016), or have prior background in causal discovery techniques, you do not need to attend the first day (June 13).
You must bring a laptop and download Tetrad software prior to arriving, as each session will have hands-on tutorials and examples.
We also encourage you to bring your own dataset to use in learning how to apply CCD software and causal discovery algorithms. Course faculty will help you determine the best algorithms for your data and research questions. If your data are large and complex, you will be able to run analyses on Pittsburgh Supercomputing Center resources. If you do not bring your own data, we will provide a practice dataset appropriate for your research interests.
If you bring your own data, we encourage you to stay through Friday afternoon and Saturday to participate in our Causal Discovery Datathon, where you can earn fame and fabulous prizes while having fun analyzing causal associations in your findings.
There is no charge to attend the Summer Short Course, which includes one poster session-dinner (June 15), continental breakfasts, and beverage/snack breaks. You will be responsible for other meals, travel expenses, and housing. The Datathon is also free and includes a pizza party dinner and continental breakfast.
A block of CMU dorm rooms (40 at $56/night) and a small block of hotel rooms (9 at $349/night) are available for reservation; please note that because the US Open is being held in Pittsburgh the same week, no discounted hotel rate is available. Dorm or hotel rooms must be reserved by May 23, 2016.
Complimentary shuttle service from the Wyndham is provided to the CMU campus on request. Please use the CMU Directions page for information on traveling to the venue site. Please direct any logistics questions to Toni Porterfield.
Monday, June 13 – Causal Graphical Models
Giant Eagle Auditorium, Baker Hall
Morning (9:00 am – 12:00 pm) – Registration will be open at 8:00 am
- Introduction
  - Overview of causal graphical models
  - Loading Tetrad
  - Causal graphs/interventions
- Building Models
  - Parametric models: Bayes nets, SEM, other (e.g., generalized SEM)
  - Instantiated models: Bayes net, SEM, generalized SEM
Lunch (12:00 – 1:30 pm) – on your own
Afternoon (1:30 – 4:00 pm)
- Estimation and Model Fit
  - Estimation
  - Inference
  - Model fit
- Hands-on Real-Data Examples
Dinner on your own.
Tuesday, June 14 – Causal Graphical Models
Giant Eagle Auditorium, Baker Hall
Morning (9:00 am – 12:00 pm) – Search I
- D-Separation
- Model equivalence
- Basic search algorithms
- Hands-on real-data examples
Lunch (12:00 – 1:30 pm) – on your own
Afternoon (1:30 – 4:00 pm) – Breakout Groups
- 1:30 – 1:50 – Introduction to additional software and Bridges supercomputer
- 1:50 – 2:00 – Breakout into workshop groups by interest and data type
- 2:00 – 3:00 – Instructor-led group breakout
  - Load data, specify and test a hypothesis
- 3:00 – 4:00 – Student free time with breakout instructor and TA(s)
Dinner on your own.
Wednesday, June 15 – Biomedical Causal Discovery Overview
Giant Eagle Auditorium, Baker Hall
Morning (9:00 am – 12:00 pm) – Search II
- Latent variable search algorithms
- Time series
- Issues involving measurement
Lunch (12:00 – 1:30 pm) – on your own
Afternoon (1:30 – 4:00 pm) – Breakout Groups
- 1:30 – 2:30 – Instructor-led group breakout
  - Enter background knowledge
  - Perform basic searches
- 2:30 – 4:00 – Student free time with breakout instructor and TA(s)
Evening: Short Course Dinner and Poster Session (included with registration)
O’Hara Student Center, 2nd Floor Ballroom, University of Pittsburgh, 4024 O’Hara Street
- 5:30-6:15 pm – Reception and Poster Session
- 6:15-8:00 pm – Dinner with keynote speaker Dr. Gregory Cooper
Thursday, June 16 – Case Studies
Giant Eagle Auditorium, Baker Hall
Morning (9:00 am – 12:00 pm) – Biomedical Case Studies
- fMRI (brain functional connectome)
- Cancer genomic drivers
- Lung disease pathways (susceptibility & progression)
- Genetic regulatory network examples
Lunch (12:00 – 1:30 pm) – on your own
Afternoon (1:30 – 4:00 pm) – Breakout Groups
- 1:30 – 2:30 – Instructor-led group breakout
  - Finalize searches
  - Summarize results
  - Create short report
- 2:30 – 4:00 – Student free time with breakout instructor and TA(s)
Dinner on your own.
Friday, June 17 – Breakout Group Reports & Discussion
Giant Eagle Auditorium, Baker Hall
Morning (9:00 am – 12:00 pm) – Wrap Up & Discussion
- Breakout group reports
- Course question-and-answer period, discussion, and wrap-up
- Evaluations
Afternoon: Causal Discovery Datathon begins at 1:00 pm in the same location (Giant Eagle Auditorium, Baker Hall)
Jeremy Espino, MD, MS, Director
Immediately following the Short Course, we will hold a Datathon, a “bring your own data” event designed to instruct and challenge biomedical researchers in the use and application of causal modeling and discovery tools. You will be required to use our CCD software, including our API (Java, R, Python) and command-line interfaces for Fast Greedy Search (FGS) and any other causal discovery algorithms available for use in June.
A panel of data scientists from our CCD Driving Biomedical Project teams will judge the analyses and award prizes based on size and complexity of data, impact with regard to the causal hypotheses generated, and innovation in the use of CCD tools.
You will need to download CCD software and perform preliminary data formatting in advance of the Datathon. If you attend the Short Course, you will have already completed this process. If you are attending only the Datathon, we will send you formatting specifications for your data so that you can prepare them in advance.
Causal Discovery Datathon
Friday, June 17
1:00 pm | Datathon introduction |
1:30 pm | Team introductions and pairing of teams with support staff |
2:00 pm | Data preparation discussion – discretization, data filtering, distributions |
3:00 pm | Instruction on supercomputing resources available to Datathon teams |
3:30-5:45 pm | Data analysis using CCD tools |
6:00-8:00 pm | Group pizza dinner & networking |
Saturday, June 18
9:00 am | Breakfast with question and answer session |
10:00 am – noon | Data hacking |
12:00 pm – 1:00 pm | Lunch (on your own) |
1:00 – 3:00 pm | Data hacking & visualization |
3:00 pm | Participant presentations |