On 23 November 2020, the CSCCE Data Science Special Interest Group (SIG) convened a meeting to discuss how to normalize talking about data. Julie Lowndes of Openscapes introduced the topic, providing an overview that is captured in full in the video archive below.
Normalizing talking about data
For scientific communities that rely on sharing data or analyzing large datasets, there is often a language barrier that complicates standardizing processes: everyone talks about their data differently. Add a lack of clarity in communication, and you end up with grad students and postdocs repeatedly reinventing the wheel to get their work done. In this presentation, Julie Lowndes shares her own experience, which ultimately led her to launch Openscapes. Openscapes offers “mentorship, training, coaching and community organizing centered around open data science [with a view to] help[ing] teams develop collaborative practices that are more reproducible, transparent, inclusive, and kind.” You can find out more on the Openscapes website.
Julie’s presentation inspired fascinating discussions in breakouts. Some of the key points that came up were:
- It’s hard to conceptualize and articulate your needs if you don’t have the vocabulary. A shared nomenclature that everyone can adopt and understand makes it easier to look past the context of a specific research project and focus on the data stewardship tasks that are common across disciplines, e.g., renaming variables or entering metadata.
- There is a distinction between research questions and data questions. Your research question has never been asked before, but your data question may have been.
- Community managers of data-driven communities face several barriers when helping researchers communicate effectively about their data:
  - Many researchers are self-taught and may carry with them a sense of shame or embarrassment that they “didn’t do something the right way.” This might also manifest in researchers not knowing how to ask the “right” questions.
  - Data science is often talked about as a separate activity or discipline.
  - The researchers in your community are diverse, with varying backgrounds and areas of expertise.
- Champions programs are one way of overcoming these barriers: they create trusted nodes within communities who help normalize talking interoperably about data.
- Open science practices add an extra layer of complexity, since many researchers will not want to share data until they have published their findings.
About the CSCCE Data Science SIG
The Data Science SIG is a space for community managers from data science, data science-adjacent, and data science-interested communities to gather and share activities, updates, and observations. We are especially interested in learning how cross-community information sharing and activities can lift up all of our communities.