2018 Summit

Artificial Intelligence in Environmental Health Science and Decision Making



15 TW Alexander Dr, Durham, NC 27703

Thursday Afternoon, October 18
Noon-1 pm    Registration
1-5 pm            Brief welcome followed by BayesiaLab seminar

Friday, October 19
8-8:30              Registration
8:30-8:40         Brief Opening
8:40-10:00         Opening Plenaries (Tom Dietterich, Paul Whaley, Samuel Adams)
10:00-10:15       Break
10:15-Noon       5 Case Studies Addressing Key Questions (15-20 minutes each)
11:45-1:00       Lunch
1:00-3:00         2 to 3 Breakout Sessions further answering all these questions and developing some recommendations
3:00-3:45         Share Findings back in Auditorium
3:45-4:00         Conclusion



BayesiaLab Workshop
“Currently, Bayesian Networks have become one of the most complete, self-sustained and coherent formalisms used for knowledge acquisition, representation and application through computer systems.” (Bouhamed et al., 2015)

In this workshop, we illustrate how scientists in many fields of study — rather than only computer scientists — can employ Bayesian networks as a very practical form of Artificial Intelligence for exploring complex problems. We present the remarkably simple theory behind Bayesian networks and then demonstrate how to utilize them for research and analytics tasks with the BayesiaLab software platform. More specifically, we illustrate supervised and unsupervised machine learning algorithms for knowledge discovery in high-dimensional domains.

Also, while Artificial Intelligence is commonly associated with another buzzword, “Big Data,” we show that Bayesian networks can bring Artificial Intelligence to problems for which we possess little or no data. Here, expert knowledge modeling is critical, and we describe how even a minimal amount of expertise can serve as a basis for robust reasoning under uncertainty with Bayesian networks.

The workshop’s examples can also be found in Chapters 4, 6, and 7 of our book, Bayesian Networks & BayesiaLab: A Practical Introduction for Researchers, which can be downloaded free of charge (bayesia.com/book).

Workshop Overview

Why Bayesian Networks?

  • What is Artificial Intelligence?
  • Why do we build models? To explain or to predict?
  • The Bayesian network paradigm as a unifying framework
  • How does this relate to Artificial Intelligence?
  • Artificial Intelligence in Practice:
    • Expert knowledge modeling and reasoning under uncertainty
    • Supervised & unsupervised machine learning for knowledge discovery


Modern Machine Learning: Probabilistic Modeling and Functional Prediction
Tom Dietterich, Distinguished Professor Emeritus, Oregon State University, School of Electrical Engineering and Computer Science

Machine learning pursues two main paradigms for data analysis: probabilistic programming and function fitting. Probabilistic programming methods provide rich languages for defining probabilistic models and efficient algorithms for fitting those models to data. Function fitting algorithms seek to fit a highly accurate prediction function drawn from a highly flexible (non-parametric) class of functions. Building on the Bayesia tutorial from the previous afternoon, this talk will illustrate probabilistic programming by discussing multilevel modeling in the probabilistic programming language Stan. Then the talk will describe the random forest method and present recent techniques for making inferences based on these versatile models. Finally, the talk will discuss deep neural networks and their potential application to problems of environmental health science. These novel methods are best applied to problems of converting unstructured data (images, text) to structured data for subsequent statistical analysis. They also have potential to revolutionize medical imaging.

Systematic Reviews, Machine Learning and the Liberation of Knowledge from Information in Environmental Health Research
Paul Whaley, Evidence Based Toxicology Collaboration Research Fellow, Lancaster Environment Centre, Lancaster University, United Kingdom

Paul Whaley is an experienced advocate of the use of systematic methods for reviewing evidence to support the development of environmental policy and regulation. He works with a range of US and European regulators, industry, NGO and academic organisations, including the US Environmental Protection Agency, the European Food Safety Authority, the Evidence Based Toxicology Collaboration and the World Health Organisation. Paul is Associate Editor for Systematic Reviews at the journal Environment International (IF 7.088), and is based at Lancaster Environment Centre in the UK.

As a researcher, Paul is developing best-practice frameworks, critical appraisal tools, and quality control interventions for improving the standard of published systematic reviews in environmental health and toxicology journals. In response to the challenges which exponential growth in research publishing presents to the conduct of high-quality, useful systematic reviews, Paul is also working on novel approaches to data extraction and storage, to convert data currently locked away in thousands of individual documents into ultra-connected, queryable databases, and artificial intelligence approaches for automating evidence surveillance and synthesis.

Environmental Health … in Context
Samuel Adams, Senior Artificial Intelligence Researcher, RTI International

Why isn’t the Holy Grail of data science and AI standing before us right now filled to overflowing with all the knowledge and insight we “just know” is there? One of the biggest reasons is the lack of context.  The environment is a vast, intermeshed metasystem of metasystems where nearly everything is connected to, and interacts with, everything else, at least at some distance across a network of intermediary systems. New tools like Deep Learning hold great promise, but without putting both the inputs AND the outputs into a greater integrated context, the ultimate value delivered by our collective efforts will still be minimal. This talk will discuss both the opportunities for developing and maintaining large scale contextual knowledge graphs as well as approaches to overcoming the technical challenges along the way.

Presentations and Panels Addressing the Following Questions

What data sets are currently available and what kind of data sets do people need to start collecting?

Highlight the applications via some case studies

How can these techniques and methods be brought to scientists and decision makers? How should people working in risk assessment be trained? How can they be helped to understand that this is a different paradigm?

What are the curriculum adjustments?



Harnessing Machine Learning to Predict Toxicities
Nicole Kleinstreuer, Deputy Director, National Institute of Environmental Health Sciences, NTP Interagency Center for the Evaluation of Alternative Toxicological Methods (NICEATM)

Traditional toxicology has relied upon testing chemicals in animals, methods which are costly, time-consuming, low-throughput, and often provide little insight into the mechanisms by which chemicals might affect human health and disease pathways. Toxicology testing in the 21st century, as demonstrated by the federal Tox21 research consortium, provides new methods to rapidly and reliably test tens of thousands of chemicals in a vast array of cellular, molecular, and genetic targets. When combined with large, well-curated reference datasets for substances with demonstrated toxic potential, machine learning approaches such as support vector machines, random forests, and deep learning can be used in supervised and unsupervised ways to predict complex toxicities and find associations between chemical structure and mechanisms. Such models demonstrate the power of artificial intelligence to inform regulatory decision making and help industry design safer, more sustainable products.


Using Bayesian networks to discover relations between genes, environment, and disease
Mark Borsuk, Pratt School of Engineering, Duke University

We review the applicability of Bayesian networks (BNs) for discovering relations between genes, environment, and disease. By translating probabilistic dependencies among variables into graphical models and vice versa, BNs provide a comprehensible and modular framework for representing complex systems. We first describe the Bayesian network approach and its applicability to understanding the genetic and environmental basis of disease. We then describe a variety of algorithms for learning the structure of a network from observational data. Because of their relevance to real-world applications, the topics of missing data and causal interpretation are emphasized. The BN approach is then exemplified through application to data from a population-based study of bladder cancer in New Hampshire. We find that allowing for network structures that depart from a strict causal interpretation enhances our ability to discover complex associations including gene-gene (epistasis) and gene-environment interactions. While BNs are already powerful tools for the genetic dissection of disease and generation of prognostic models, there remain some conceptual and computational challenges. These include the proper handling of continuous variables and unmeasured factors, the explicit incorporation of prior knowledge, and the evaluation and communication of the robustness of substantive conclusions to alternative assumptions and data manifestations.

Machine Learning in Dose-Response Assessment:  Translating Science to Decisions
Jackie MacDonald Gibson, Associate Professor, UNC Chapel Hill, Department of Environmental Sciences and Engineering; RTI University Scholar, 2017-2018

In the United States, decisions about whether to implement new drinking water regulations are based on quantitative risk assessments of the potential health benefits of these regulations.  The methods used for these risk assessments do not yet incorporate major advances in artificial intelligence that could improve risk predictions.  To demonstrate the potential advantages of adopting artificial intelligence methods for U.S. environmental regulatory risk assessment, this talk will present a case study predicting the benefits of decreasing arsenic exposure in drinking water using a machine-learned Bayesian belief network model.  The model was learned from a data set of 1,050 individuals from an arsenic-endemic region of Chihuahua, Mexico.  The model integrates arsenic exposure data with biomarkers of arsenic metabolism and demographic characteristics to quantify the probability of diabetes for different exposure levels and population subgroups.  The predictive ability of the Bayesian network model will be compared to that of a reference dose model and of a model estimated with Benchmark Dose Software, which are the prevailing approaches in current U.S. regulatory risk assessment practice.  Implications for policymaking will be discussed.

Bayesian Inference for Substance and Chemical Toxicity (BISCT)
Lyle D. Burgoon, Leader, US Army Engineer Research and Development Center, Bioinformatics and Computational Toxicology Group

Lyle Burgoon leads research on Artificial Intelligence to Drive the Military Environment. The focus of Dr. Burgoon’s work is the creation of Artificial Intelligence that can augment human decision-making in primarily military and humanitarian environments. Dr. Burgoon is an expert in AI sensor fusion (sensors including Internet of Things sensors, ground-based sensors, space-based sensors, laboratory-based sensors) to augment human decision-making in difficult national security settings. Environmental public health challenges that Dr. Burgoon works on include predicting the impacts of the environment and environmental changes on military intelligence and warfighter readiness, AI sensor fusion to understand urban warfare environments and health impacts, technologies for food security and logistics, forecasting the potential secondary and tertiary impacts of warfare on human health, and the automated global identification of human infrastructure networks. Dr. Burgoon’s recent work also includes forecasting potential toxicity of military materials based on structural information, and the use of Bayesian Networks to fuse data from laboratory assays to predict potential toxicity. Prior to joining the US Army, Dr. Burgoon was a Branch Chief and senior science advisor at the US Environmental Protection Agency’s National Center for Environmental Assessment.

Rare Diseases and AI Analysis with Potential Use of the Environmental Genome
Dr. Michael Kowolenko, CEO, NoviSystems
Dr. Michael Overcash, Executive Director, Environmental Genome Initiative

Analytical infrastructure that fuses different data sets can yield applications that let users collect information and draw conclusions in a less cumbersome and confusing fashion. Applying “backend” compute techniques such as machine learning and natural language processing allows complex relationships to be explored and then conveyed to individuals with rare diseases. In a related effort, individuals with rare diseases are grouped into familial or nonfamilial categories. The latter can be examined with several analytics built on the Environmental Genome, which estimates specific chemical emissions from manufacturing plants, transportation areas, energy grid locations, and agriculture. The future challenge is how to link these chemical sources to the exposure zones of those with nonfamilial rare diseases. This is the first step in a much more transparent, but complicated, approach to environmental pollutants and the use of artificial intelligence.



Samuel Adams, RTI International
Surafel Adere, Duke
Michelle Angrish, US EPA
Martin Armes, The Collaborative
Scott Auerbach, NIEHS
Maureen Avakian, MDB Inc.
David Aylor, NC State
Maryam Azimi, Lenovo
Mamta Behl, NIEHS
Shannon Bell, ILS
Alexandre Borrel, NIEHS
Mark Borsuk, Duke
David Brown, The Collaborative
Lyle Burgoon, US Army
Neal Cariello, Integrated Laboratory Systems
Xialquing Chang, Integrated Laboratory Systems
Rada Chirkova, NC State
Sarah Catherine Colley, UNC Chapel Hill
Gwen Collman, NIEHS
Stefan Conrady, Bayesia USA
Jesse Cushman, NIEHS
Sivanesan Dakshanamurthy, Georgetown
Sally Darney, NIEHS
Deepa Dawadi
Demosmita, NC State
Rob DeWoskin, etioLogic
Tom Dietterich, Oregon State
Cedric Dongmo, University of Yaounde
Chris Duncan, NIEHS
Steve Dutton, US EPA
Steve Edwards, RTI International
Neeraja Erraguntla, American Chemistry Council
Jianing Fan, Duke
Lydia Feinstein, Social & Scientific Systems
Kenda Freeman, MDB Inc.
Jim French, Live Learn Innovate Foundation
Stavros Garantziotis, NIEHS
Patrick Gray, Duke
Lauren Gridley, RTI International
Hanbing Guan, Cisco
John Hardin, NC Board of Science, Technology & Innovation
Linchen He, Duke
Gina Hilton, PETA International Science Consortium
Stephanie Holmgren, NIEHS
Kennedy Holt, UNC Chapel Hill/NC Public Health
Beibei Hu, Duke
Sara Imhof, NC Biotech Center
Jaronda Ingram, Zoetis
Kristin Inman, NIEHS
Agnes Janoshazi, NIEHS
Peer Karmaus, NIEHS
Chandana Kasireddy, NIEHS
Channa Keshava, US EPA
Manal Khan, UNC Chapel Hill
Nicole Kleinstreuer, NIEHS
Les Klimczak, NIEHS
Michael Kowolenko, NoviSystems
Hamid Krim, NC State
Dhirendra Kumar, NIEHS
Archana Lamichhane, NIEHS
Christopher Lavender, NIEHS
Janice Lee, US EPA
Tess Leuthner, Duke
Jian-Liang Li, NIEHS
Xing Li, Duke
Yuanyuan Li, NIEHS
Rui Liu, Social & Scientific Systems
Yun Liu, UNC Chapel Hill
Ming Lu, Health Canada
Jackie MacDonald Gibson, UNC/RTI
Alexandra Maertens, Johns Hopkins
Elizabeth Mannshardt, US EPA
Dwi Sianto Mansjur, IBM
Kamel Mansouri, ILS
Courtney McCortsin, Duke
Sena McCrory, Duke
Vanessa Michelou, Novozymes
Erika Munshi, Duke
Reshma Nargund, Duke
Emmanuel Obeng-Gyasi, NC A&T
Michael Overcash, Environmental Genome Initiative
Okan Pala, NC State
Shannon Parker, Duke
Rajneesh Pathania, NIEHS
John Phillips, Georgia International
Terry Pierson, RTI International
Sunil Rajgopal Prasad, MResult
Asif Rashid, DS Technologies
Caroline Ridley, US EPA
Leon Rosentsvit, Technion IIT
Marianna Rosentsvit, NIEHS
Risa Sayre, US EPA
Kate Scholfield, US EPA
Frederic Seidler, Duke
Joseph Shaw, Indiana
Mina Shehee, NC DHHS
Shanshan Shi, Duke
Susan Simmons, NC State
Linda Smail, Zayed University
Raquel Silva, US EPA
Solomon Tamkabari, Rivers State College
Michele Taylor, US EPA
Shane Thacker, US EPA
Kimberly Thigpen Tart, NIEHS
Jacob Traverse, Triangle Global Health Consortium
Natalia Vinas, ERDC
Jimmy Washington, NIEHS
James Weaver, US EPA
Leah Wehmas, US EPA
Paul Whaley, Lancaster University UK
Emily Woolard, US EPA
Ya Xue, Infinia ML
Chis Yoo
Hong Zu, NIEHS
Hal Zenick, US EPA
Yongjie Zhou, US FDA


Michelle Angrish, US Environmental Protection Agency
Martin Armes, Research Triangle Environmental Health Collaborative
Scott Auerbach, National Institute of Environmental Health Sciences
Maureen Avakian, MDB, Inc.
Mark Borsuk, Duke University
David Brown, Research Triangle Environmental Health Collaborative
Kenda Freeman, MDB, Inc.
Stephanie Holmgren, National Institute of Environmental Health Sciences
Jackie MacDonald Gibson, UNC-Chapel Hill/RTI International
Michael Overcash, Environmental Genome Initiative
Terrence Pierson, RTI International
Michele Taylor, US Environmental Protection Agency
Kimberly Thigpen Tart, National Institute of Environmental Health Sciences
Hal Zenick (Retired), US Environmental Protection Agency

Updated as event nears

Work Groups

Hotel Information

Contact Martin Armes for details