
Note - the recording will be posted for those who weren't able to make the call.


Recording: [.mp4]

Intro

  1. Denise Akob reminded participants to fill out the Bioinformatics Survey and post any Data Release questions on the Forum for next month's call.
  2. To get added to the group email list, fill out the survey linked above, or email cdi@usgs.gov.

Alces Flight Offering at Cloud Hosting Solutions HPC/HTC Service

Courtney Owens, CHS

Slides

See Forum post on Existing High Performance Computing Bioinformatics Resources

  1. CHS Overview

  2. Alces Flight is Linux-based software that you set up in its own CHS environment

    1. "Alces Flight Compute provides a fully-featured, scalable High Performance Computing (HPC) environment for research and scientific computing. Compatible with both on-demand and spot instances, Flight rapidly delivers a whole HPC cluster, ready to go and complete with job scheduler and applications."

    2. Command line interface

    3. AWS S3

    4. 1 cluster for each scientist

    5. Spot Price Market - fraction of market rate

  3. Alces Gridware

    1. 1240 software applications

    2. 671 Bioinformatics applications

    3. 2 different repositories with different levels of "readiness"

      1. Main - thoroughly tested with latest release of Alces Flight

      2. Volatile - have not been thoroughly tested - may have more issues

    4. Gridware.alces-flight.com

    5. You can request that an application be added by posting in the forum or through the private support system.

    6. You can install all bioinformatics apps with one command

  4. There are three editions

    1. Solo Community - USGS CHS started off with this for beta testing but for better support, switched to...

    2. Solo Professional - in use since the beginning of February. USGS gets its own private support channel/forum, which already has questions from beta testers. There is a choice of HPC scheduler, and there are preinstalled software packs.

    3. Enterprise

  5. Initial Beta Testing Results

    1. Issues installing packages from the volatile repository, specifically Qiime

  6. Moving Forward

    1. We would like more bioinformatics scientists to test - contact Courtney at clowens@usgs.gov

    2. Testing is free

Q&A

  1. Q: If we have used other commercial package software, can they be added to Alces - do you need special licenses? 

    1. A: Courtney will look into that and let them know.

  2. Scott Cornman: this is entirely command line as it is being tested. Will there be a GUI option? 

    1. A: They can’t have a GUI because of security restrictions, but Courtney will look into that. Comment: FYI, the commercial packages from the previous question use GUIs.

  3. Denise Akob: How many beta testers so far? 

    1. A: Have had 5-6 total, 2 active now - Adam Mumford and Scott Cornman. Q: What software have they been using? A: Qiime and mothur (?). Courtney is working with Janice Gordon and Jeff Falgout, who will also be testing, possibly with Python.

  4. Sophia: Is testing in person or remotely? 

    1. A: we are able to do it remotely - go into the environment, install packages and run models on them.

  5. Denise: what is the time frame to open this up to people who have need for this computing environment? 

    1. Courtney: The plan is an Enterprise launch at the CDI workshop in May.

  6. Denise: how will billing work? Will scientists get billed individually? 

    1. A: We are figuring that out. Testing is currently funded by OEI (Tim Quinn); billing will depend on how the cost structure breaks down. There will be no individual scientist billing at first. Hopefully there will be funding through the end of the fiscal year.

  7. Scott: Will the cost structure be roughly on the scale of Amazon rates? 

    1. A: Should be 20-30% lower than normal Amazon rates. Running models during beta testing will help them figure out the breakdown.

  8. Scott: is there a minimum set of users? Because it will take a while to grow the base. 

    1. A: That would be the supervisor’s decision. We hope it will grow.

  9. Courtney: hope to have more releases with added features in the future, based on the needs they hear from customers. There is a big application registry with a lot of demand. The CHS team is about 15 people and is moving to a managed service route to help more customers.

  10. Denise: how often is the software updated? Regularly? Does it need to be requested? 

    1. A: Every 3-4 months. Some software is updated more often than others.

  11. Demo by Courtney

  12. Comment: new users will probably need help setting up their environment

  13. Alces can check the dependencies of volatile packages if you request it.

  14. According to the documentation, there is a structure in place to install packages outside of the Gridware system - will that be enabled? 

    1. A: we think so. Just need to make sure permissions are okay for everyone.  Sometimes if you install numpy outside of qiime it is okay.

  15. Do you need to be well versed in Linux to use this? 

    1. A: We’re hoping it will be easy to go in, set permissions, and install packages, but you need to be used to working on the command line.

  16. Can this be done interactively, or only scheduling and batch mode? 

    1. A: Can be done interactively, don’t need to write a script to submit the job.

  17. Will you be able to run interactively when it is scaled up? 

    1. A: If you have autoscaling turned on, then you should be able to. I’ll check with Jeff.

  18. Comment on the related forum post if you have more questions. Courtney plans to check on some questions and update the group.
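
As a rough illustration of the "20-30% lower than normal Amazon" estimate from the Q&A above, the sketch below computes a cost range for a single run. The function name, the $1.00 hourly rate, and the 100-hour duration are hypothetical placeholders chosen for the example; they are not CHS or AWS prices.

```python
def discounted_cost_range(on_demand_hourly, hours,
                          low_discount=0.20, high_discount=0.30):
    """Return (lowest, highest) estimated cost for a run.

    The discount range defaults to the 20-30% figure quoted on the call.
    All inputs are assumptions supplied by the caller.
    """
    base = on_demand_hourly * hours
    lowest = base * (1 - high_discount)   # best case: 30% below the base rate
    highest = base * (1 - low_discount)   # worst case: 20% below the base rate
    return lowest, highest


# Hypothetical example: one node at $1.00/hour used for 100 hours.
lowest, highest = discounted_cost_range(1.00, 100)
print(f"Estimated cost: ${lowest:.2f} to ${highest:.2f}")
```

Actual savings will depend on the cost breakdown that CHS works out during beta testing.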

Follow-up answers

From Courtney Owens, 3/1/17:

Hi all,

Thank you for a great discussion on Alces Flight.

Here are answers to the questions I was not able to fully answer on the call:

Will there be a GUI option with Alces Flight?

Jeff Carson and I have started this discussion with the rest of our team. We need to figure out if there are any workarounds available for having a GUI that would still abide by necessary security constraints. We are not sure of the timeline for this and will keep you updated. 

Can commercial package software be added to Alces Flight?

We have not tested this yet, but believe it is possible to install commercial software packages on Alces Flight. We can verify this with Alces Flight if a user can provide us with a particular package that they are interested in. Furthermore, we would like to point out that the Alces Flight Platform is designed to be ephemeral. Because of this, we do not recommend installing commercial package software that you are not using in conjunction with software applications available in Alces Flight. We do not think that Alces Flight is the right tool for this particular use case and are hoping to have an offering that will accommodate this use case in the future.   

When you run interactive jobs, do compute nodes still autoscale?

An interactive job will only use whatever compute nodes are available to it. This means you will have to choose the type and number of nodes you want to run interactively. Our scaling policies will terminate compute nodes provisioned this way when they are not being used, but will not scale them up.

Please let me know if you have any additional questions or are interested in helping us beta test.

Thank you for your time,

Courtney
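
The follow-up answer on interactive jobs describes a scale-down-only policy: idle nodes are terminated, but no new nodes are added for an interactive session. The sketch below is a toy model of that behavior, not Alces Flight or CHS code; the `Node` class and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    busy: bool


def can_start_interactive(nodes, nodes_needed):
    """An interactive job can only use nodes that already exist and are free."""
    free = sum(1 for node in nodes if not node.busy)
    return free >= nodes_needed


def apply_scale_down_only(nodes):
    """Terminate idle nodes and keep busy ones; never add new nodes."""
    kept = [node for node in nodes if node.busy]
    terminated = [node for node in nodes if not node.busy]
    return kept, terminated


# Hypothetical cluster: one busy node and two idle nodes.
cluster = [Node("node1", True), Node("node2", False), Node("node3", False)]
print(can_start_interactive(cluster, 2))   # two free nodes are enough
print(can_start_interactive(cluster, 3))   # a third free node will not appear
kept, terminated = apply_scale_down_only(cluster)
print([n.name for n in kept], [n.name for n in terminated])
```

In this model, the number of nodes you choose up front is a hard ceiling for the interactive session, which matches the advice to pick the type and number of nodes before starting.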

GeoHackathons and eDNA and Invasive Species Work

Sophia Liu

  1. Hacking does not have only a negative connotation!

  2. Goals

    1. Leverage crowdsourcing, citizen science, and civic hacking

    2. Promote free and open innovation skills

    3. Hack for Change and Hack Red Tape: hacking in a positive way.

    4. Not just shiny object at the end, but a socio-cultural, technological, organizational, political-policy advancement

    5. There can be Art and design aspects

  3. Sophia is organizing Date, Time, and Location - TBD

    1. Have one at DOI, one at Reston USGS, and one at a non-federal location

    2. Try to make it accessible via live streaming.

  4. Project Themes relevant to this call

    1. Critical Minerals and Resources

    2. Threats to Biodiversity

  5. Invasive Species project

    1. How to leverage remote sensing data sources and products to detect invasive species?

  6. eDNA - Sophia is working with Andrea Ostroff, JC Nelson, Denise Akob

    1. Ideahack - explore ways to create sustainable viz/collab tools

  7. Do you have comments, thoughts, questions, concerns?

    1. Who might be the primary users of eDNA, etc.?

    2. Denise: an option for the Bioinformatics CoP, we could talk through algorithms for bioinformatics at some sort of Hackathon.

    3. Scott: can you clarify what the structure of a GeoHackJam is? What does it look like? Is it a single day? A: Keynote speakers and short presentations on the projects and tools, goals, round-robin introductions, geohacking, a summary to share outputs, progress, and next steps, and Mappy Hour. Each topic has a day devoted to it. Then there is a final sprint and presentations of general hacks.

    4. Sophia: welcomes discussions on the disciplinary aspects so she can learn more about the discipline. What datasets? Q: do you need location and nature of the sample, or do you need the detailed DNA results? Do you need more info? Ask Andrea Ostroff. If not published yet, that is understood. The more information available, the better.

  8. What interest in eDNA is there on the phone? A: A significant minority are either interested or practicing.

  9. From Sophia: 

    1. Here is more information on the GeoHackJam

    2. Here is the Invasive Species project

    3. Here is more info on the eDNA project

  10. Email sophialiu@usgs.gov

  11. There is further information on the Bioinformatics Forum post on GeoHackJam / GeoDataJam.

 

Attendees:

28 attendees, awaiting the WebEx report

 
