CDI Monthly Meeting - 20191009

The Community for Data Integration (CDI) meetings are held the 2nd Wednesday of each month from 11:00 a.m. to 12:30 p.m. Eastern Time.

Meeting Resources

Recordings and slides are available to CDI Members approximately 24 hours after the completion of the meeting. Please log in to view the meeting resources. If you would like to become a member of CDI, join at

Agenda (in Eastern time)

11:00a Opening and Welcome - Kevin Gallagher, Associate Director for Core Science Systems, and Tim Quinn, Chief of the Office of Enterprise Information

11:15a Working Group Announcements CDI_20191009_OpeningSlides.pdf

11:25a Packaging data and software: Jupyter as a publication platform - Keith Maull and Matthew Mayernik, NCAR

11:55a DevOps role in data integration and delivery: Use of Gitlab CI/CD, AWS Cloud Development Kit (CDK) and Elastic Container Service (ECS) at NGTOC - David Hughes and Rob Djurasaj, USGS 

12:30p Adjourn


Packaging data and software: Jupyter as a publication platform

Digital science produces many kinds of research products and resources, including data, software, documents, and workflows. Managing the relationships among these resources is a central requirement for current data systems. This talk presents projects within the National Center for Atmospheric Research (NCAR) Library that are investigating the use of Jupyter tools to enable effective management and publication of scientific data and related resources. The goal of the talk is to stimulate discussion on optimal approaches for identifying, validating, characterizing, and preserving scientific information and tools.

Keith Maull is a Software Engineer in the NCAR Library. He leads research and development projects within the Library related to bibliometrics, scholarly metrics, and software management. He works on developing strategies and techniques for the future of research traceability and reproducibility through computational narratives. Keith is also the computational mentor lead for the UCAR Significant Opportunities in Atmospheric Research and Science (SOARS) educational program, in which capacity he conducts workshops and professional development seminars on computational thinking, Python, and reproducible research. He completed his PhD in computer science at the University of Colorado-Boulder.

Matt Mayernik is a Project Scientist and Research Data Services Specialist in the NCAR Library. He leads NCAR Library research projects and operational services related to research data curation. He is the current chair of the NCAR Data Stewardship Engineering Team, which provides coordination and technical systems for data curation across NCAR, and is a member of the Board on Data Stewardship within the American Meteorological Society. He received his Ph.D. from the UCLA Department of Information Studies.

DevOps role in data integration and delivery: Use of Gitlab CI/CD, AWS Cloud Development Kit (CDK) and Elastic Container Service (ECS) at NGTOC - David Hughes and Rob Djurasaj, USGS

This talk gives an overview of multiple Infrastructure as Code (IaC) tools, followed by a deep dive on how to set up a GitLab Runner to perform GitOps functions, including IaC deployments against multiple AWS environments. It includes a demo of NGTOC's approach to applying best practices for rapid development and continuous deployment of immutable infrastructure.

Robert Djurasaj is the Cloud Architect (detail) and Delivery Project Manager at the National Geospatial Technical Operations Center (NGTOC) in Denver, where he manages a team responsible for serving geospatial data and services to both public and internal users. He has been a leader in the Center and in USGS in transitioning from an on-premises to a cloud-based IT architecture. Currently, he is working closely with NGTOC teams to enhance and promote the use of the AWS cloud.

David Hughes is the Science Systems Development Section Chief at the National Geospatial Technical Operations Center (NGTOC).