
CDI Monthly Meeting - 20180214

The Community for Data Integration (CDI) meetings are held the 2nd Wednesday of each month from 11:00 a.m. to 12:30 p.m. Eastern Time.

Meeting Recording

Meeting recordings are available to CDI Members approximately 24 hours after the completion of the meeting. Please log in to view the recording. If you would like to become a member of CDI, please email

During the call, you can ask and up-vote questions at, event code #3991.

Agenda (in Eastern time)

Presentation slides for the opening and announcements are available publicly. Other slides are accessible when logged in as a CDI member.

11:00a Intro to FAIR Data and what you can do about it [PDF] 

11:10a Welcome 

11:15a Working Group Announcements [PDF]

11:25a  Reproducible Notebook Series - Using Python to Bring Geophysical Data to the Surface - Kyle Enns and Cristiana Falvo 

11:40a  Semantic web for sustainability: revolutionizing how we write, find, link and reuse data and models - Ferdinando Villa

12:00p  Panel to discuss semantic web and the USGS - Ken Bagstad, Dalia Varanka, Julia Moriarty, Leslie Hsu

12:30p  Adjourn


Presentation: Slides are available to CDI Members. Please log in to download the slides. If you would like to become a member of CDI, please email


Reproducible Notebook Series - Using Python to Bring Geophysical Data to the Surface

Kyle Enns and Cristiana Falvo (USGS)

Abstract: USGS has a storied history of being at the forefront of science technology, developing novel hardware and software solutions to best address scientific needs. However, preserving legacy USGS data derived from those innovative tools can present significant challenges, particularly when using traditional, manual preservation methods. For example, preserving a single, early-generation USGS magnetotelluric data file in a modern, open format is extremely complicated, and therefore time-consuming and costly, for a human to prepare. USGS is believed to be in possession of tens of thousands of legacy magnetotelluric data files. However, in many fields of science and industry, software is rapidly replacing humans for complicated, labor-intensive tasks. Therefore, to study the potential broad-scale application of software for USGS legacy data preservation, the USGS Data at Risk project partnered with the USGS Crustal Geophysics and Geochemistry Science Center to develop and publish a Python workbook that batch processes magnetotelluric data in EDI format, ready for USGS data release. This presentation will provide an overview of the preservation project, the software mechanics of processing magnetotelluric data, and the potential of a generalized Python script to batch process all USGS magnetotelluric files.
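
To make the batch-processing idea concrete, here is a minimal Python sketch. It is not the code from the published workbook. It assumes the files are already plain-text EDI files whose `>HEAD` block holds `KEY=VALUE` lines, and the function names `parse_edi_head` and `batch_summarize` are hypothetical, chosen only for illustration.

```python
from pathlib import Path


def parse_edi_head(text):
    """Return the KEY=VALUE pairs found in the >HEAD block of EDI text.

    Assumption: blocks start with a line beginning with '>' and the
    header block is named >HEAD. Other blocks are ignored.
    """
    fields = {}
    in_head = False
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith(">"):
            # A new block starts; only the >HEAD block is read here.
            in_head = line.upper().startswith(">HEAD")
            continue
        if in_head and "=" in line:
            key, _, value = line.partition("=")
            fields[key.strip().upper()] = value.strip().strip('"')
    return fields


def batch_summarize(folder):
    """Parse every .edi file in a folder and return {filename: header fields}."""
    return {
        path.name: parse_edi_head(path.read_text(errors="replace"))
        for path in sorted(Path(folder).glob("*.edi"))
    }
```

Calling `batch_summarize` on a folder returns one dictionary of header fields per file. A table of that kind is a typical first step before checking metadata for completeness across a large collection.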

Semantic web for sustainability: revolutionizing how we write, find, link and reuse data and models 

Ferdinando Villa, Ph.D. (Basque Centre for Climate Change (BC3); IKERBASQUE, Basque Foundation for Science)

In the digital age, we need scientific data and models to be FAIR - Findable, Accessible, Interoperable, and Reusable - to help individuals, businesses, and governments make informed decisions. A fully connected information landscape using open, safe, accurate “Wikipedia-like” sharing and linking of data and models can enable data-intensive science, management and governance on a scale yet unimagined. In recent years, the ARIES (ARtificial Intelligence for Ecosystem Services [1]) platform has become a first complete implementation of a semantically integrated modeling technology, automatically assembling the ecosystem service models that offer the most context-accurate view of each human-natural system under investigation. I will discuss a practical semantic integration approach by example, illustrating the four main pillars of innovation implemented in the k.LAB software:

    1. Semantics, addressing all the “W’s” of information – what, where, when, why, and how – with languages and tools that make scientific observations easier to describe, understand, and use, while defining a worldview geared toward describing scientific phenomena and problems related to Earth and its natural and human inhabitants.
    2. Open, linkable data through understanding of – and agreement on – the nature of scientific information, including, but not limited to, adopting shared vocabularies and open standards to publish semantically annotated data as first-class research objects, so that they can be found online, read and understood by computers and humans alike.
    3. Open, linkable models, which, just like data, are seen as ways to make scientific observations. Powered by semantics, artificial intelligence can transparently match the right data and models to the chosen time, place, problem, and scale, transferring much of the complexity of building and running models to machines, with substantial advantages for science and decision making.
    4. Software infrastructure, providing tools and interfaces for end users, modelers, and network administrators, aimed at simplifying the tasks of semantically describing, coding, and distributing data and models as much as possible.

Lastly, I will give a preview of the Integrated Modelling Partnership [2], bringing together institutions that will contribute to designing and building a fully integrated information landscape for the science of the future. 


Bio: Ferdinando Villa is a theoretical ecologist who has had a long parallel career as a scientific software designer and engineer. After working in many fields of basic and applied ecology, he discovered the joys and the pains of interdisciplinary research during a 15-year stint in ecological economics at the universities of Maryland and Vermont. After moving back to Europe for a tenured position in 2010, he still finds it a challenge – and a responsibility – to maintain scientific depth unaltered in the face of the greatly increased breadth required by modern science. His research sits at the multi-faceted interface of linguistics, computer science, social science, ecology and economics, concentrating on artificial intelligence approaches to assisting environmental decision making and natural system assessment and valuation. In 2017, he and his team founded the Integrated Modelling Partnership, joining institutions that are contributing to designing and building a fully integrated information landscape for science, using the technologies pioneered during his career. He has been the recipient of major research grants from the US National Science Foundation, the European Union, the UK NERC and other institutions, governments and NGOs, all aiming to contribute to the science of coupled natural/human systems and to build effective technologies that take sound science and “democratize” it by putting it at the fingertips of decision makers worldwide.


A Participant Report is available to CDI Members. Please log in to download the report. If you would like to become a member of CDI, please email


  1. It looks like the recordings didn't get linked to their respective .arf or .mp4 download files.

    1. They are posted now!

  2. I was just looking for it also, and the slides please!