The Community for Data Integration (CDI) meetings are held on the second Wednesday of each month from 11:00 a.m. to 12:30 p.m. Eastern Time.
Recordings and slides are available to CDI Members approximately 24 hours after the completion of the meeting. Please log in to view the meeting resources. If you would like to become a member of CDI, join at https://listserv.usgs.gov/mailman/listinfo/cdi-all.
11:00a Opening and Welcome - Kevin Gallagher - Associate Director for Core Science Systems and Tim Quinn - Office of Enterprise Information Chief
11:15a Working Group Announcements (slides: CDI_20191009_OpeningSlides.pdf)
11:25a Packaging data and software: Jupyter as a publication platform - Keith Maull and Matthew Mayernik, NCAR
11:55a DevOps role in data integration and delivery: Use of GitLab CI/CD, AWS Cloud Development Kit (CDK) and Elastic Container Service (ECS) at NGTOC - David Hughes and Rob Djurasaj, USGS
Packaging data and software: Jupyter as a publication platform
Digital science produces many kinds of research products and resources, including data, software, documents, and workflows. Managing the relationships among these resources is a central requirement for current data systems. This paper presents projects within the National Center for Atmospheric Research (NCAR) Library that are investigating the use of Jupyter tools to enable effective management and publication of scientific data and related resources. The goal of the talk is to stimulate discussion on optimal approaches for identifying, validating, characterizing, and preserving scientific information and tools.
Keith Maull is a Software Engineer in the NCAR Library. He leads research and development projects within the Library related to bibliometrics, scholarly metrics, and software management. He works on developing strategies and techniques for the future of research traceability and reproducibility through computational narratives. Keith is also the computational mentor lead for the UCAR Significant Opportunities in Atmospheric Research and Science (SOARS) educational program, in which capacity he conducts workshops and professional development seminars on computational thinking, Python, and reproducible research. He completed his PhD in computer science at the University of Colorado-Boulder.
Matt Mayernik is a Project Scientist and Research Data Services Specialist in the NCAR Library. He leads NCAR Library research projects and operational services related to research data curation. He is the current chair of the NCAR Data Stewardship Engineering Team, which provides coordination and technical systems for data curation across NCAR, and is a member of the Board on Data Stewardship within the American Meteorological Society. He received his Ph.D. from the UCLA Department of Information Studies.
DevOps role in data integration and delivery: Use of GitLab CI/CD, AWS Cloud Development Kit (CDK) and Elastic Container Service (ECS) at NGTOC - David Hughes and Rob Djurasaj, USGS
The talk gives an overview of multiple Infrastructure as Code (IaC) tools, followed by a deep dive on how to set up a GitLab Runner to perform GitOps functions, including IaC deployments against multiple AWS environments. It includes a demo of NGTOC's approach to using best practices for rapid development and continuous deployment of immutable infrastructure.
Robert Djurasaj is the Cloud Architect (detail) and Delivery Project Manager at the National Geospatial Technical Operations Center (NGTOC) in Denver, where he manages a team responsible for serving geospatial data and services to both public and internal users. He has been a leader in the Center and in USGS in transitioning from an on-premises to a cloud-based IT architecture. He is currently working closely with NGTOC teams to enhance and promote the use of the AWS cloud.
David Hughes is the Science Systems Development Section Chief at the National Geospatial Technical Operations Center (NGTOC).