April 8, 2020: CDI Funded Projects
The Community for Data Integration (CDI) meetings are held the 2nd Wednesday of each month from 11:00 a.m. to 12:30 p.m. Eastern Time.
Join from PC, Mac, Linux, iOS or Android: https://zoom.us/j/338586400
A password is required this month. Check for an email from firstname.lastname@example.org on 4/6/20 for the password, or email email@example.com.
Dial (for higher quality, dial a number based on your current location):
US: +1 669 900 6833 or +1 408 740 3766 or +1 646 876 9923
Meeting ID: 338 586 400
Or iPhone one-tap:
US: +16699006833,,338586400# or +14087403766,,338586400#
International numbers available: https://zoom.us/zoomconference?m=Dfpb2Rwy-790IFvpEU0Xa-6z3Gex-mO8
Meeting Recording and Slides
Recordings and slides are available to CDI Members approximately 24 hours after the completion of the meeting.
Log in to view the meeting resources. If you would like to become a member of CDI, join at https://listserv.usgs.gov/mailman/listinfo/cdi-all.
During the call, you can ask and up-vote questions at slido.com, event code #23331.
Agenda (in Eastern time)
11:00 am Welcome and Opening Announcements - Virtual work and collaboration
11:20 am Collaboration Area Announcements
11:30 am Open-source and open-workflow Climate Scenarios Toolbox for adaptation planning - Aparna Bamzai, USGS
11:45 am Develop Cloud Computing Capability at Streamgages using Amazon Web Services GreenGrass IoT Framework for Camera Image Velocity Gaging - Frank Engel, USGS
12:00 pm Establishing standards and integrating environmental DNA (eDNA) data into the USGS Nonindigenous Aquatic Species database - Jason Ferrante, USGS
12:30 pm Adjourn
- Remote meetings resource: https://about.gitlab.com/company/culture/all-remote/meetings/
- Since the last call, we have added more security to our Zoom meetings, adding a password to Zoom calls.
- There has been an increase in virtual meeting attendees in the last couple of weeks.
- ESIP Collaboration Areas Highlights Webinar on April 22: https://www.esipfed.org/webinars
- To join any of the CDI collaboration areas, see https://my.usgs.gov/confluence/x/JaapJg
- Kevin Gallagher
- The CDI is more important than ever in maintaining connection and communication.
- Virtual collaboration
- Do you have a "virtual water cooler"? Microsoft Teams and the CDI wiki are possible places for these kinds of conversations.
- Share notes and highlights after virtual meetings so others can benefit from your activity. CDI collaboration areas are great for these kinds of notes.
- Share your tips, tricks and ideas for working virtually with the CDI.
- Tim Quinn
- CDI is "collaboration on a massive scale" and is very important at this time.
- Feedback from CDI has been passed on to the EarthMAP team and has allowed the team to identify which aspects of EarthMAP are most exciting and which are most confusing.
- EarthMAP update from Sky Bristol
- Blog post (link for USGS employees)
- Intranet page (link for USGS employees)
- MS Team (link for USGS employees)
- Announcements from Collaboration Areas (see slides for full details)
- Town hall meeting: April 15th, Testing Usability with Users
- Resource review: May 20, Usability and Building Trust
- Have usability questions? Post them at: https://my.usgs.gov/confluence/x/yZCpJg
- Interested in being a usability tester? Sign up at: https://my.usgs.gov/confluence/x/ZMmpJg
- Want to stay in touch? Join the listserv via: https://listserv.usgs.gov/mailman/listinfo/cdi-usability
- Semantic Web
- Paper discussion, April 9
- "Best Practices for Implementing FAIR Vocabularies and Ontologies on the Web" by Daniel Garijo and María Poveda Villalón: https://arxiv.org/pdf/2003.13084.pdf
- 2020 SWWG Meetings
- Metadata Reviewers
- Last meeting, April 6
- Next meeting, May 4
- Meetings of the Metadata Reviewers Community
- Tech Stack
- Next meeting, April 9, "Unidata Science Gateway" https://science-gateway.unidata.ucar.edu/ http://wiki.esipfed.org/index.php/Interoperability_and_Technology/Tech_Dive_Webinar_Series#9_April_2020:_.22Unidata_Science_Gateway.22_Julien_Chastang
- Data Management
- Next event, April 13, Upcoming changes to the Science Data Catalog with Lisa Zolly
- Last event, March 9, Value Propositions with Science Gateways
- Software Development
- Next event, April 23
- Open Innovation
- April 17, COVID-19 Open Innovation Efforts: https://my.usgs.gov/confluence/display/cdi/COVID-19+Open+Innovation+Efforts
- The Opportunity Project (TOP) – Earth Sprint (Problem Statement Due Friday, April 10 – email me at firstname.lastname@example.org if you would like to help): https://opportunity.census.gov/sprints/
- TOP Earth Sprint Roundtable Notes: https://docs.google.com/document/d/1UE8cMjDL2_aJpwShHv7gn1hThrvpTQC9K5uBQ0zXadA/edit?usp=sharing
- FEMA PrepTalk on "Crowdsourcing & Citizen Science as Force Multipliers for Emergency Management" by Sophia Liu: https://www.fema.gov/preptalks
- Citizen Science Association Webinar: https://www.citizenscience.org/events/webinars
- Citizen Science Association COVID-19 Resources: https://www.citizenscience.org/covid-19
- Next meeting, April 16, Human-Centered Design Thinking with Impact360 Alliance (part 3)
- Risk Community of Practice Community Survey: https://tinyurl.com/vp3xla4
- Risk page: https://listserv.usgs.gov/mailman/listinfo/cdi-risk
- Annual meeting was March 17-18; all recordings are available on the ICEMM (Interagency Collaborative for Environmental Modeling and Monitoring) CDI website.
- Open-source and open-workflow Climate Scenarios Toolbox for adaptation planning: Aparna Bamzai-Dodson
- Link to website: https://www.earthdatascience.org/cst/index.html
- Scenario planning: a way to consider the range of possible outcomes using 3-5 plausible, divergent scenarios. Managers and scientists can use this information to develop adaptation strategies.
- The Climate Scenarios Toolbox is attempting to take the pain out of working with climate data.
- The Toolbox is open and usable, allowing other users to contribute open code. The Toolbox hopes to do the following:
- lower the barrier to entry
- automate common tasks
- reduce the potential for errors
- empower a larger user community
- The link above includes a getting started guide for the Toolbox.
- There is extra support for the National Park Service, as NPS was a partner for this project.
- Engaging CDI
- Install and use the Toolbox
- Provide feedback on issues/features
- Contribute to the package
- Develop Cloud Computing Capability at Streamgages using Amazon Web Services GreenGrass IoT Framework for Camera Image Velocity Gaging: Frank Engel
- Gaging (measuring water quantity)
- Sometimes we can't measure
- flashy regimes
- indirect (post flood) methods aren't cheap
- How do we get past these issues?
- non-contact methods
- imagery combined with software: gets complicated, requires training, and involves some subjectivity
- want to automate this process and take some of the pain out of it
- CHS/AWS IoT Cloud Processing Goal
- First required building a cloud infrastructure
- Auto-provisioning to the cloud
- MQTT Schema (in progress)
- Generating global actions (see something, do something)
- Initial time-lapse video Lambdas (for SSTL)
- Lessons learned
- Cloud computing knowledge takes a lot of work to acquire
- A lot of hands in the cookie jar
- In the short term, it can be difficult to sort through the differing needs of stakeholders
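The "see something, do something" idea from the talk, where an edge device acts on a reading from a sensor, might be sketched roughly as below. This is an illustrative sketch only, not the project's code: the `StageReading` type, the threshold value, the meter unit, and the `start_recording` callable are all hypothetical stand-ins, and no AWS or camera API is used.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class StageReading:
    """One water-surface elevation (stage) reading. Hypothetical type."""
    site_id: str    # hypothetical site identifier
    stage_m: float  # stage in meters (assumed unit)


def should_record(reading: StageReading, threshold_m: float) -> bool:
    """Return True when the stage is at or above the trigger threshold."""
    return reading.stage_m >= threshold_m


def handle_reading(
    reading: StageReading,
    threshold_m: float,
    start_recording: Callable[[str], None],
) -> bool:
    """See something (a high stage), do something (start the camera).

    start_recording is a placeholder for whatever actually starts
    video capture on the edge device.
    """
    if should_record(reading, threshold_m):
        start_recording(reading.site_id)
        return True
    return False


if __name__ == "__main__":
    started = []  # collects site IDs for which recording was triggered
    handle_reading(StageReading("site-A", 2.4), 2.0, started.append)
    handle_reading(StageReading("site-B", 1.1), 2.0, started.append)
    print(started)  # ['site-A']
```

Keeping the decision (`should_record`) separate from the action (`start_recording`) makes it easy to swap in a different trigger, such as a reading queried from another streamgage, without touching the camera code.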
- Establishing standards and integrating environmental DNA (eDNA) data into the USGS Nonindigenous Aquatic Species database: Jason Ferrante
- eDNA is genetic material released by an organism into its environment (skin, blood, saliva, feces into surrounding air, water, soil, etc.).
- Why add a data layer to the NAS database specifically for eDNA?
- Want to combine traditional specimen sightings and eDNA detections for more complete distribution records, improving response time to new invasions.
- Aquatic invasive species specifically are the species of interest.
- Need to establish strong community standards that will yield high-quality data that can be validated.
- What did we do?
- Experimental Standards
- eDNA literature review
- establish standard criteria regarding sampling design and collection, laboratory processing, and data analysis
- Stakeholder Backing
- Reviewing criteria among stakeholders
- Input by eDNA community of practice
- pre-submission form to vet data before it is included
- Teleconferences to gain consensus (ongoing process)
- Produce a white paper
- Integration into NAS
- Community standards
- Web submission form/template
- Prototype web viewer (map)
- Pre-submission survey
- Two blocks of questions, some that will require a "yes" in order to move forward, some that will vet the data better
- Quick start guide for the database became a need during the feedback process
- See slides or recording for mock-up of map view
- Expected challenges:
- Getting to consensus on submission form
- Staying organized and keeping lines of communication open
- Meeting the needs of managers and researchers (getting feedback)
- town hall style meetings to present ideas and garner feedback
- if you're interested, it will be Monday, April 13 - contact West Daniel if you'd like to attend
- Takeaways/follow-up
- Networking is very important. Use existing infrastructures (such as CDI!); Teams is also working very well
- Within the CDI group, many are looking for help developing new tools which use eDNA data. Working on a manuscript that provides insight about the process
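The two-block pre-submission survey described above, with some questions that must be answered "yes" to move forward and others that only vet the data further, could be modeled along these lines. This is a sketch under stated assumptions: the question keys are hypothetical, loosely based on topics mentioned in the discussion (controls, assay vetting, multiple field samples), and which questions are actually required is not stated in these notes.

```python
from typing import Dict, List, Tuple

# Hypothetical question keys; the real survey questions are not listed here.
REQUIRED_QUESTIONS = ["controls_included", "assay_validated"]
VETTING_QUESTIONS = ["multiple_field_samples"]


def screen_submission(
    answers: Dict[str, bool],
) -> Tuple[bool, List[str], List[str]]:
    """Screen a pre-submission survey.

    Returns (can_proceed, blocking, follow_up):
    - blocking: required questions not answered "yes"
    - follow_up: vetting questions not answered "yes"; these do not
      block the submission but flag it for closer review
    A missing answer is treated as "no".
    """
    blocking = [q for q in REQUIRED_QUESTIONS if not answers.get(q, False)]
    follow_up = [q for q in VETTING_QUESTIONS if not answers.get(q, False)]
    return (len(blocking) == 0, blocking, follow_up)


if __name__ == "__main__":
    ok, blocking, follow_up = screen_submission(
        {"controls_included": True, "assay_validated": True}
    )
    print(ok, blocking, follow_up)
```

Treating unanswered questions as "no" is a conservative choice that fits the stated goal of being inclusive while maintaining a high level of data quality.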
- Based on recent USGS guidance, will this call be moved off of Zoom to Teams?
- Leslie: We are testing external participation now and will keep the CDI informed of tech choice. Anyone interested in testing or discussing further, get in touch with me!
- Would you be able to see if you have any "surprising" new users as a result of the tool, or do you have ideas of how to learn if you do?
- The package is not officially released on GitHub, but we are working on it, and there will be a publication in the Journal of Open Source Software. We hope to see people fork the code so that we can incorporate user modifications back into the main branch. Hoping the user community picks this up and makes it into a bigger and better toolbox.
- What was your process for identifying your main users and their needs?
- Our center is part of a USGS network meant to work with natural resource management partners to help understand climate adaptation science. We have worked quite extensively with the Fish and Wildlife Service and the National Park Service over 7 years on supporting their science needs, and saw inefficiencies in the workflow: there were commonalities in the data requested, but we were starting from the ground up each time we had to provide it. Anyone doing research across the continental US can use it, so we are hoping to expand past the initial stakeholders.
- Is the climate reanalysis data included so that historical climate (and weather) can be downloaded too?
- Yes, it can do historical and future comparisons.
- Could you say more about the Journal of Open Source Software?
- Open review process; way to release new tools that are allowing people access to new data. Write-ups for publication are pretty short; description of the package/software, what problem it solves, and how you are contributing (not doing something that is already done). Goes through peer review process and is released as a publication. Nice way to get the tool out to a broader community.
- Is it possible to include a vignette that uses the software in a JOSS publication?
- Will find that out and get back to you.
- Is "edge computing" the same as "data proximate analysis"? Can you explain it a bit more?
- No (don't know what data proximate analysis is). Edge computing is putting some sort of internet-connected device (like a Raspberry Pi) at the edge, in the environment it's meant to be in: a camera on the bank of a river, a phone in your pocket, etc.
- Can a user trigger the camera remotely? Or, based on water surface elevation?
- Yes. You can trigger the Raspberry Pi to record video based on an external trigger. If it's an internet-connected Pi, you can query another streamgage or other sensor and trigger that way. We are in the process of testing some other methods.
- How do you handle security with IoT? Are these Raspberry Pis protected since they have AWS credentials to upload the data? Is there a DMZ for video uploads?
- We are developing a STIG for Raspberry Pis. We enforce how people set up modems in the field so the device is unreachable from the outside world.
- Are you using the Raspberry Pi camera?
- Yes. We use many different cameras. We rely on security web cameras that are NDAA compliant.
- Can you elaborate a bit on Infrastructure as Code practices you're using for this IoT project? You mentioned that you had to create infrastructure first
- Foundry is the infrastructure. Code is Python-based.
- How many IoT RPi cameras are there? Do they provide a constant video feed?
- Two IoT-enabled cameras. 20-40 connected cameras are not on IoT.
- Is that RPi STIG available anywhere?
- What happens to the video stream after processing? Is it archived somewhere?
- You can do whatever you want with the video stream artifacts. That is up to the owner of the system.
- Can you comment about any "disagreements" that came up on the submission form when you got input from your community?
- Earliest iteration was just a CSV file, and more work would be done to vet data. Idea for pre-submission survey came up as input from the community. Lots of conversations about controls, that controls were in place, making sure we had questions that vetted the assay that was being run, making sure people were taking multiple samples from the field. Wanted to be as inclusive as possible, while maintaining a high level of quality.
- Is the eDNA data being used to validate/reinforce other species detection/occurrence data in NAS?
- Not specifically, but we can work toward the ability to do that. NAS does a lot of work to vet photos/data that come in.
- Can you talk a bit more about spatial controls? Links to NHD?
- This data layer is not going to be linked to anything, but this is one of the types of areas that might help to inform broader understanding of species distribution. We are interested in ways to pair eDNA with covariates.
- Have you looked at how your community standard will translate to the biological data standard: Darwin Core?
- Yes, they are similar. There's going to be a lot of overlap, and would like to make it overlap as much as possible.