CDI Software Development Cluster

Meeting Notes

February 28th, 2019 @ 3:30PM ET / 1:30PM MT

Topic: Cloud and Big Data in the Cloud; an Open Session Discussion
 

Join from PC, Mac, Linux, iOS or Android: https://zoom.us/j/696628840

Or iPhone one-tap:

US: +16699006833,,696628840# or +14087403766,,696628840#

Or Telephone:

Dial (for higher quality, dial a number based on your current location):

US: +1 669 900 6833 or +1 408 740 3766 or +1 646 876 9923

Meeting ID: 696 628 840

Note that we have switched from GSTalk to Zoom for the time being, due to a number of usability/compatibility issues with the GSTalk platform.

 

Meeting notes in the shared Google Drive folder:

https://docs.google.com/document/d/1tC4Pmmhax_CTL2-wsjKlmqBA8DziRfelvJgbLKWPD5I/edit?usp=sharing

 

Attendees

Name                   Email (if you are new)
Michelle Guy           mguy@usgs.gov
Travis Harrison        tharrison@usgs.gov
Rob Miller             rfmiller@usgs.gov
Steven Predmore        spredmor@usgs.gov
Cassandra Ladino       ccladino@usgs.gov
Elizabeth McCartney    emccartney@usgs.gov
Colin Talbert          talbertc@usgs.gov
Courtney Owens         clowens@usgs.gov
Andy LaMotte           alamotte@usgs.gov
Jeanne Jones           jmjones@usgs.gov
Eric Martinez          emartinez@usgs.gov
Hans Vraga (late)      hvraga@usgs.gov
Sam Pecoraro (late)    specoraro@usgs.gov
Carl Schroedl          cschroedl@usgs.gov
Leslie Hsu (late)      lhsu@usgs.gov
Drew Ignizio           dignizio@usgs.gov

Please take the quick Sli.do poll... https://app.sli.do/event/vhjskdfk

 

 

 


Agenda

     Welcome and announcements

     Please fill in name and email in the attendees table

     We are always looking for topics, and for your input and participation!

     We have created a form for submitting presentation proposals for future Software Dev Cluster meetings

     https://docs.google.com/forms/d/e/1FAIpQLSccsoCmFH4aT1OQNKaMDG7-ngIAlyGgmqSRQwJc_uYFf_tVUQ/viewform

    CDI BisonConnect Google Calendar of all the collaboration area meetings and events: name is “GS CDI”, owner is gs_cdi@usgs.gov

      At least one session on cloud - “Let’s talk cloud”

      Topic for next month: this software dev community comes up with session proposal(s)

    Cloud and Big Data - Cassandra intro

    Sli.do poll - Have you done anything in the cloud?

    Get into the cloud

   Check out YouTube videos

   Online training - Learning Tree, Cloudera

   Sign up for a CHS AWS Sandbox

    Information about the Sandbox

    AWS has a seemingly endless list of products, but CHS has narrowed the list of what it offers; an overview was provided

    S3 buckets - a fancy folder where each data file gets a URL; easy to use

    Data Lake example: how it has changed over the past few years, and now DocumentDB

    Data types mapped to technology examples (Big Data slide)

    Adding structure to unstructured data with things like Pig and Hive

    Text search with Elasticsearch (clusters) and Lucene

    General information tends to be business intelligence oriented

    Possibilities

   What if AWS DocumentDB were applied to scientific data (e.g. released data in ScienceBase)?

   Graph DBs?

     Open mic

     Announcements?

     Questions?

     Lessons learned?

     Fun projects under way?

     Next Month: ?

     Coordinate on proposing CDI Session Topics [1] [2]

       More cloud topics this summer

       CHS could present on something
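
As a rough illustration of the S3 point above (a bucket acts like a folder whose files each get a URL), here is a minimal Python sketch that builds the virtual-hosted-style URL for an object. The function name, bucket name, region, and key are made-up examples, not anything shown in the session, and the resulting URL only resolves if the object exists and is readable by the requester.

```python
from urllib.parse import quote


def s3_object_url(bucket: str, key: str, region: str = "us-west-2") -> str:
    """Build the virtual-hosted-style HTTPS URL for an S3 object.

    All arguments are illustrative; substitute a real bucket, key, and region.
    """
    # Percent-encode the key but keep "/" so the folder-like path stays readable.
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key, safe='/')}"


# Hypothetical bucket and key, used only to show the URL shape.
print(s3_object_url("example-bucket", "data/stream gauges.csv"))
```

The key plays the role of a file path inside the "folder", which is why the slashes are left unencoded while the space is escaped.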

 

Discussion/Notes

      Apache Spark (Big Data Space)

https://databricks.com/try-databricks
https://www.cloudera.com/products/data-science-and-engineering/data-science-workbench.html

 

      Jeanne is looking at graph DBs and thinking about how they could be used

      Get info on the USGS CHS sandbox: https://support.chs.usgs.gov/display/CHSKB/Help+Center

    Sign up for a CHS AWS Sandbox

    Information about the Sandbox

      Collaborators space

      Services are more limited in sandbox

      Sandbox is wiped clean quarterly

      Sandbox is shut down nightly (cost savings)

      A simple application can be an EC2 instance and a DB; it doesn’t have to be a complex suite of services

      Serverless? Carl - Water Mission Area is in the early days of serverless, manually testing and exploring Lambda; standard deployment pipeline is into ECS

      Is a Jenkins instance running for CHS customers? How to access it? Courtney - CHS is not running Jenkins anymore; most customers run their own. If there is enough customer interest, CHS can explore setting it up. CHS has moved to other solutions/technologies (AWS Source Catalog)

      CHS user customer group meets monthly - topics include containers, ECS, and OpenShift; next month (March 20th) is Tableau. The meetings are recorded: https://support.chs.usgs.gov/x/IgPv

      Email clowens@usgs.gov if you are interested in attending and would like to be added to the calendar

 


[1] Just a note that after March 1 we will be trying to organize what has come in with hopes to have a draft agenda in early March!

[2] Ok, we might try to help out at our meeting next month and make some structure and organization out of the software suggestions. :)