CDI Software Development Cluster

Meeting Notes

February 28th, 2019 @ 3:30PM ET / 1:30PM MT

Topic: Cloud and Big Data in the Cloud; an Open Session Discussion

Join from PC, Mac, Linux, iOS or Android:

Or iPhone one-tap:

US: +16699006833,,696628840# or +14087403766,,696628840#

Or Telephone:

Dial (for higher quality, dial a number based on your current location):

US: +1 669 900 6833 or +1 408 740 3766 or +1 646 876 9923

Meeting ID: 696 628 840

Note that we have switched from GSTalk to Zoom for the time being, due to a number of usability/compatibility issues with the GSTalk platform.


Meeting Notes in Google Drive: Shared Google Drive Folder:




Attendees (add your email if you are new):

Michelle Guy

Travis Harrison

Rob Miller

Steven Predmore

Cassandra Ladino

Elizabeth McCartney

Colin Talbert

Courtney Owens

Andy LaMotte

Jeanne Jones

Eric Martinez

Hans Vraga (late)

Sam Pecoraro (late)

Carl Schroedl

Leslie Hsu (late)

Drew Ignizio

Please take the quick poll...





     Welcome and announcements

     Please fill in name and email in the attendees table

     We are still always looking for topics, and your input and participation!

     We have created a form for submitting presentation proposals for future Software Dev Cluster meetings

    CDI Bison Connect Google calendar of all the collaboration area meetings and events - name is “GS CDI”, owner is

      At least one session on cloud - “Let’s talk cloud”

      Topic for next month: this software dev community comes up with session proposal(s)

    Cloud and Big Data - Cassandra intro poll - Have you done anything in the cloud?

    Get into the cloud

   Check out YouTube videos

   Online training - Learning Tree, Cloudera

   Sign up for a CHS AWS Sandbox

    Information about the Sandbox

    AWS has a seemingly endless list of products, but CHS has narrowed the list of what they offer; an overview was provided

    S3 buckets - fancy folder where data files get a URL, easy to use

    Data Lake example - how it has changed over the past few years, and now DocumentDB

    Data types mapped to technology examples (Big Data slide)

    Adding structure to unstructured data with things like Pig and Hive

    Text search with Elasticsearch (clusters) and Lucene

    General information tends to be business intelligence oriented


   What if AWS DocumentDB were applied to scientific data (e.g. released data in ScienceBase)?

   Graph DBs?
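The S3 note above describes a bucket as a folder where each data file gets a URL. A minimal sketch of what that URL typically looks like, assuming the standard AWS virtual-hosted-style addressing; the bucket name, object key, and region here are hypothetical placeholders and were not part of the discussion:

```python
from urllib.parse import quote


def s3_object_url(bucket: str, key: str, region: str = "us-west-2") -> str:
    """Build the virtual-hosted-style HTTPS URL for an object in an S3 bucket.

    bucket, key, and region are illustrative values, not real CHS resources.
    """
    # Percent-encode the key, but keep "/" so folder-like prefixes stay readable.
    encoded_key = quote(key, safe="/")
    return f"https://{bucket}.s3.{region}.amazonaws.com/{encoded_key}"


if __name__ == "__main__":
    # Hypothetical bucket and file name, for illustration only.
    print(s3_object_url("example-bucket", "streamflow/2019/site 01.csv"))
```

The URL only resolves for an anonymous reader if the object has been made publicly readable; otherwise access goes through signed requests or a presigned URL.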

     Open mic



     Lessons learned?

     Fun projects under way?

     Next Month: ?

     Coordinate on proposing CDI Session Topics [1] [2]

       More cloud topics this summer

       CHS could present on something



      Apache Spark (Big Data Space)


      Jeanne is looking at graph DBs and thinking about how they could be used

      Get info on USGS CHS sandbox

    Sign up for a CHS AWS Sandbox

    Information about the Sandbox

      Collaborators space

      Services are more limited in sandbox

      Sandbox is wiped clean quarterly

      Sandbox is shut down nightly (cost savings)

      Simple application can be an EC2 instance and a DB, doesn’t have to be complex suite of services

      Serverless? Carl - Water Mission Area is in the early days of serverless: manually testing, exploring Lambda; standard deployment pipeline goes into ECS

      Jenkins instance running for CHS customers? How to access? Courtney - CHS is not running Jenkins anymore; most customers run their own. If there is enough customer interest, CHS can explore setting it up. CHS has moved to other solutions/technologies (AWS Source Catalog)

      CHS user customer group meets monthly - topics include containers, ECS, and OpenShift; next month (March 20th) is Tableau. Meetings are recorded:

      Email if you are interested in attending and would like to be added to the calendar


[1] Just a note that after March 1 we will be trying to organize what has come in with hopes to have a draft agenda in early March!

[2] Ok, we might try to help out at our meeting next month and make some structure and organization out of the software suggestions. :)