The items below are taken from emails reporting on the interviews people conducted to review the draft exit survey.
Monday I had an opportunity to get comments on the Exit Survey from Jeff Eidenshink, a longtime scientist at EROS. Jeff retires this Friday so his 'context' for reviewing our form was spot on. He did not fill out the form, but provided comments for many sections. His thoughts are below. Note that he jumped around a bit so the comments are not necessarily in the order of our document.
o Too much redundancy
o Property and IT already do several of the items on the form
o Answering "Yes" should require some explanation, but as the form is written, "Yes" could be the entire answer
o Questions are not answerable (related to the "Yes" comment)
o Mobile device was already turned in. [related to other devices that may have data on them]
o The survey was way too long
o In response to the "Where are the electronic data located" question, his response could be on web pages, thumb drives, etc. Is that really useful?
o "Is the data labelled?" Aren't file names labels? Same question: If I answer "Yes", then what happens?
o This is more of a business practice checklist
o Who is "us" in the survey?
o Regarding "Data Preservation," Who is this questionnaire for?
o IT questions should not be on this form
o What if my HD is encrypted, do I need to give you the key?
o "Any personal devices with data" Well, I have a Landsat scene at home on my computer.
o Did not understand the use of "Anything else?" questions. What are you looking for?
o "Are there any systems involved in automated data access?" He would answer "Yes" and that would answer the question. He knows we want more than that, but the question could be answered with a yes or no only.
o We are assuming we have a good legacy to preserve.
o We can't release any dataset without metadata. Why are you asking me metadata questions?
o Often not clear what answer we are looking for.
o The EROS checklist [that involves records] is too brief, but this form is overboard.
o What is non-standard software? Aren't we supposed to not have any non-standard software loaded?
o The use of Yes/No formatted questions probably isn't asking what we really want to know.
I provided the draft exit survey to two researchers here at NOROCK. The respondents included my Center Director who has had a research career spanning decades and who can provide a manager's perspective. The second respondent has been with USGS for approximately 5 years and was previously a university researcher. Attached is the survey with his responses and comments.
While Jeff didn't complete the survey, he did respond to indicate he thought the survey was too long and that its length would be an impediment to completion. On this point, I personally feel that if it takes days or even weeks to complete the questionnaire, it is time well spent to capture the information associated with data collected over a career that may span 30+ years. I also believe this will seldom occur.
Rob's responses provide an example of what may be typical if the questionnaire is not administered. His comments regarding expectations, motivation, and time commitment are worth noting.
I think a major benefit of this survey, to NOROCK, is to show just how poorly we are meeting the incremental steps necessary for good data management. We are currently limited by a lack of clear policy, procedures, manpower, and infrastructure. I don't think we are an anomaly. This survey may prove more useful if administered several times during a career. In the future, I plan to use its final version as a benchmark of how well we are meeting the need for study plans, metadata, etc., and I hope the survey itself will one day be superfluous because the requested information has already been captured.
Thanks for the opportunity to participate.
Sorry, but I will be unable to attend tomorrow's meeting. I conducted an interview with Pierre Glynn of the National Research Program, and these are the observations from that interview. He also suggested that I speak to two other specific people, one who is newer to USGS and the other more "seasoned."
- Too many surveys in USGS now
- Make it clear as to why people need to do this
- Provide two options - one to complete the form on your own and another to have someone walk you through it
- Basis+ is only used because it is required - they have their own process for tracking the progress on projects that is more descriptive
- The project contact list might be difficult for people who don't intend to hand their projects over - most people wouldn't shepherd someone's project after their colleague left
- License section should be adjusted to account for non-software licenses
- The form seems to have questions that could have multiple answers for different projects, and it wouldn't show how those answers connect as you move through the form
- Reference the locations/groups to talk to or email for questions/issues within the questionnaire - e.g., RLO, property, etc.
- We need to add Google drive, website, and shared drives
- For Models and code, we don't ask the specifics of what project/program it is tied to
- Provide links to litigation hold database for verification
- Address FOIA requested information - in case there are appeals or repeat requests for this information
- Use this as an opportunity to teach and remind people of different requirements and where to go for more info or guidance
Also, I spoke to OBIS, the OEI group that creates the SharePoint forms, and they are willing to work with us on developing the form. We would just need to get the form closer to where we want it to be before taking that next step.
Ken Rice (Gainesville Florida Center Director):
I looked over the questionnaire and I think this will be very helpful as we have folks retire. I am trying to give scientists several months to archive their data and develop metadata prior to retirement...but this will help us survey where they left things. My only comment is that it may be helpful to link the projects to data rather than have separate questions...so you might have them describe a project, tell us about the data type and location, etc... then move on to another project.
Eric Strom (SC WSC Center Director):
Sorry for the delay, quite a busy few weeks. I think this is a good idea to try and preserve our information as people leave. Often things are simply lost when someone moves on. My only two comments would be to make the link to the USGS Survey Manual lead to a relevant topic, rather than the general Webpage, and also check the document throughout for "data" and use it as a plural rather than singular noun (data were, not data is).
Nice work putting this together.
Lori Anne Baer
Exit Survey Interview with Research Geologist Bob Thompson and Geographer Richard Pelltier
Bob has led many large projects, including the study of packrat ‘middens’ (up to 25,000 years old!) which contain parts of plants and other debris used to date climate change chronology, etc. Bob is not planning on retiring soon, so the discussion was more ‘general’ with comments about data, projects, etc.
Comments and Questions –
1) General comment – ‘when I first saw this document, I thought there is no way most scientists in this building would want to fill this out….especially if they are on their way out the door’. Document is too long and redundant. Discussion continued about how an ‘interview’ would be more appropriate when a person is leaving the USGS.
2) Define ‘Data’ – discussion about the amount of data (8 Terabytes) a student working on this project in Oregon is responsible for! New versions are being created quite often, so at what point is the ‘data’ worth recording as metadata, etc. Many times a ‘subset’ of a database is used to run analysis of data.
3) As far as metadata, this project team does not make a habit of collecting metadata as per FGDC Standards, but the printed or digital reports contain appropriate metadata… the source of the info used for datasets to perform analysis, etc.
4) Many datasets are ‘moving targets’….updates to climate change data and flora descriptions happen quite often and analysis needs to be redone.
5) Data Management plans – general discussion about how when you start a project, you have a plan in place as to where the data exists, where it will be stored, links to other sources of the data, etc. I don’t think many of the scientists who have been around for a while have ever taken or been offered a DM planning class per se.
6) Servers discussion – Most data located on local PC’s and external drives. This project tries to do their own backups. We won’t mention the word ‘Dropbox’ here!? General discussion about the lack of automated backup systems in this building. In some cases there are links to the data via other groups (ex. NOAA). DVD’s are shared among team members. Budget issues don’t always allow for the purchase of large capacity servers for the life of the project.
7) Software/Licenses – Non standard applications might be used to create a dataset or product, but they try to convert the results to formats readable by standard USGS software. (Ex. Use of proprietary package called PSColor, created around 1992 in Fortran language, still used to create diagram layout for Atlas publication! But they convert it to .pdf for final dissemination.)
8) Electronic Data – where is the data located? – answered ‘yes, all of the above’! …data located on computers, websites, external drives, personal computers.
9) Data Labeled? – Always issues with filenames…may be intuitive to one person, but not to another team member. Or 1 year later, you, the creator of the data, have no idea what was meant in the data file name!
10) Websites – yes, both internal to USGS and public websites have been created.
11) With 4 funded people on the project, if Bob were to leave tomorrow, someone knows all about the data!
Read Lori's interview notes.
In general I think some of the comments John got from his retiring colleague at EROS apply, but I also think that some of the comments reflect perhaps an old guard perspective that I think won't be common moving forward with other staff and are sort-of obsolete.
The biggest comment I agree with is the seeming redundancy in the form that I think speaks to a larger issue of the layout and in turn usability of the form with its design. There's overlap among sections and at present I'm not sure it should go live with the current layout (I know it's still a draft). Almost seems like a whiteboard exercise of the major topics and then subtopics might help in terms of grouping and identifying overlap. Some of the questions in sections now lack context that is needed from other sections or require information to make sense that has already been provided in other sections, so I can see how from a non-data-management-minded perspective the form might seem confusing and redundant.
I also think we should not avoid perceived IT-related questions or topics. For one, that comment deals with personal definitions (e.g., "What does 'IT' constitute?"; "Who is responsible for what parts of IT?") that can vary widely. In our science center, IT has no involvement with working on cooperator projects, but they might set up a virtual server for a project to use (and once the out-of-the-box setup is done, any configurations beyond that are the responsibility of the project chief / technical lead), and, as is probably common, the IT unit here does not track project data / documentation. Data today are increasingly electronic, and thus require computer systems to access, and you're dealing with hardware / software / ITish things to work with those data. Perhaps just focusing on responsibility for things and roles relative to projects and overall cost center issues is better. I'm not convinced, though, that every cost center works from the same 'employee exit' playbook with questions that are asked by IT or others, which of course is part of why we are developing this survey for DM issues.
More comments in the document itself (see attachment -Science_data_exit_form_DRAFT_053114_TB.docx)
Here is my edit on the Intro
I asked two researchers here at the USGS Coastal and Marine Science Center in Woods Hole, MA, to review the exit form and provide their comments. Please find attached the comments received from oceanographer Brandy Armstrong; geologist Kathy Scanlon's comments are listed below. I feel that both researchers offered some valid points.
Brandy's suggestion that the project portion of the form should be in an outline format is well taken, though I'm not sure whether that is because most marine science data collections are normally in an outline format (COMPASS). As the WHSC data librarian, I therefore view it not as uniquely specialized but as the norm; perhaps not so for others?
The other reviewer, Kathy, has been a research geologist here at WHSC for 35 years, with the geology of seafloor habitats as her area of interest. In recent years Kathy has worked primarily as a collaborator with outside vendors. In her comments, she asks a project question: are we asking for only USGS contacts?
General Information: 'Please provide the current name and contact information ...' Comment: Do you want current or post-departure contact info or both?
Projects Contact and Relationships: '...current point of contact that is different from departing employee...' Comment: Does 'current point of contact' need to be at USGS? Might want to allow for more than one "current point of contact". May be different people cognizant of different aspects of a project.
Projects: 'What relationships, if any do these have with each other...' Comment: Or with other projects?
Further comments: Suggests using a broader term than 'spin-off,' and asking 'Are there any related projects?' (making her preceding comment obsolete).
Comment: Mention funding sources and/or outside funding sources, especially if project is ongoing, but could be useful for finished projects as well...
Documentation-Data Management Plans: 'Are there any manuals...' Comment: If so, where are they located? Same comment for the question about field books... If so, where are they?
Publications: "Where are all the files...for each publication" Comment: Do you mean for the pubs in process? The files for published pubs are, er, published. Seems like you need a list of completed pubs, annotated to show relationships with projects, AND some questions about unfinished pubs.
Physical data heading. Comment: You probably should separate 'earth materials' from 'paper' data.
'Where are the data located?' Comment: What about the condition of these data, e.g., have they been kept sealed? Have they been sub-sampled? Are they contaminated?
Comment: Storage requirements e.g. In need of refrigeration, hazardous, heavy...
'Field notebooks' Comment: Again, what are they?
'Ongoing Data Access' Comment: What about access for outside collaborators? Co-op agreements?
Overall, management is very excited by this form, and adopting a formal process to help identify data is very valuable for our science center. Below are a series of comments from the discussion.
We often don't get adequate notice on departing employees, sometimes as little as 2 weeks. This survey really requires a bit of effort for which they may not have adequate time.
Encourage employees to prioritize documentation on the most valuable data assets.
Our team has no policy on how to document data upon departure.
A test implementation with recently departing or close-to-departing employees will be very valuable. I am trying to conduct this test, but am having trouble finding time (for both the scientist and myself).
Our local Data management project will include language and funds to facilitate this type of interview in BASIS for FY15.
Recommend that we prioritize the form, focusing primarily on the ‘who/what/where/when/why/how’ of data, and de-emphasizing some of the less important items (such as software/hardware).
Implementation of the form may want to clearly highlight:
- Must-have info
- Less important info
- Least important info
We may want to document different scenarios depending on the expertise available to help the scientist with this form. Data-management rich science centers versus those lacking.
This could be another important role of a science-center level data manager.
Another question to add: Is data backed up regularly/where is your data?
Add name of supervisor to form
How to document emeriti or other continuing projects?
In project contacts & relationship, what point of contact is to be provided? Supervisor, manager, project chief, data steward?
Scientists will likely not know anything about DOIs.
Expand details on field records & disposition schedules. What are they, and how to use them.
Recommends putting the publication section higher in order of questions.
Modify the IPDS question to, ‘Provide IPDS number’ for easy lookup.
Another way to organize is separate active/current data that may be in review versus legacy data.
The system/hardware and software may not be relevant to many users.
Include questions about whether a server or website is maintained by the departing scientist and, if so, whether a new POC has been identified and trained for the transition of responsibility.
Have the ‘Data’ section as the first section.
Physical samples could also include rock cuttings, oil or water samples, etc.