NSDC Data Science Flashcards – Data Science Ethics Card #3 – FAIR Principles Part 1

This NSDC Data Science Flashcards series will teach you about the importance of data ethics. This installment was created by Florence Hudson and Varalika Mahajan. Recordings were done by Lauren Close, Florence Hudson, and Emily Rothenberg. You can find these videos on the NEBDHub YouTube channel.

In the ever-expanding realm of data, ensuring that information is Findable, Accessible, Interoperable, and Reusable is more critical than ever. Enter the FAIR Principles. In 2016, the “FAIR Guiding Principles for Scientific Data Management and Stewardship” were published, with the aim of improving how we manage and share digital assets.

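These principles place particular emphasis on machine-actionability: the capacity of computational systems to find, access, interoperate with, and reuse data with little or no human intervention. Why? Because the volume and complexity of data keep growing, and we rely on computational systems more than ever before.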

Let’s dive into the ‘F’ – Findable. It all starts here: before data can be used, they have to be found. Data should be easy for both humans and machines to locate, and machine-readable metadata are key for automatic discovery. Data are Findable if:

  • (Meta)data are assigned a globally unique and persistent identifier.
  • The data are described with rich metadata.
  • Metadata clearly and explicitly include the identifier of the data they describe.
  • (Meta)data are registered or indexed in a searchable resource.
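
The four conditions above can be read as a rough checklist. The following is a minimal sketch, assuming a hypothetical metadata record stored as a Python dictionary; the field names (`metadata_id`, `data_id`, and so on) and the `example.org` identifiers are invented for illustration. It cannot verify that an identifier is truly persistent, only that one has been recorded.

```python
# Hypothetical record: the identifiers and field names below are invented
# for illustration and do not resolve to real resources.
RECORD = {
    "metadata_id": "https://example.org/meta/0001",
    "data_id": "https://example.org/data/0001",
    "title": "Hourly air temperature readings",
    "description": "Sensor readings from a rooftop weather station.",
    "keywords": ["temperature", "weather", "sensor"],
}


def build_index(records):
    """Map each lowercase keyword to the metadata identifiers that use it."""
    index = {}
    for rec in records:
        for keyword in rec.get("keywords", []):
            index.setdefault(keyword.lower(), []).append(rec["metadata_id"])
    return index


def findable_gaps(rec, index):
    """Return the Findable conditions that this record does not yet meet."""
    gaps = []
    if not rec.get("metadata_id"):
        gaps.append("no identifier assigned")
    if not (rec.get("title") and rec.get("description") and rec.get("keywords")):
        gaps.append("metadata not rich enough")
    if not rec.get("data_id"):
        gaps.append("metadata do not name the data identifier")
    indexed_ids = {i for ids in index.values() for i in ids}
    if rec.get("metadata_id") not in indexed_ids:
        gaps.append("not indexed in a searchable resource")
    return gaps


index = build_index([RECORD])
print(index.get("weather", []))      # identifiers found by keyword search
print(findable_gaps(RECORD, index))  # an empty list means no gaps were found
```

What counts as "rich" metadata is a judgment call; the three-field test here is only a placeholder for whatever a given community expects.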

Now, the ‘A’ – Accessible. Once you’ve found your data, you need to know how to access them, possibly through authentication and authorization. Data are Accessible if:

  • (Meta)data are retrievable by their identifier using a standardized communications protocol.
  • The protocol is open, free, and universally implementable.
  • The protocol allows for authentication and authorization procedures, where necessary.
  • Metadata remain accessible, even when the data are no longer available.

‘I’ stands for Interoperable. Data need to work together. They must integrate with other data and play nicely with applications and workflows. Data are Interoperable if:

  • (Meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation.
  • (Meta)data use vocabularies that follow FAIR principles.
  • (Meta)data include qualified references to other (meta)data.
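
As one illustration of these conditions, the sketch below describes a dataset in JSON-LD using schema.org terms. Both choices are assumptions made for this example rather than something the FAIR Principles prescribe, and the `example.org` identifiers are invented. The `isBasedOn` property shows what a qualified reference looks like: it states how one dataset relates to another instead of merely linking to it.

```python
import json


def to_jsonld(name, identifier, based_on):
    """Serialize a dataset description using schema.org terms as JSON-LD."""
    doc = {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "identifier": identifier,
        # A qualified reference: the property name states how the two
        # resources relate, rather than just listing a bare link.
        "isBasedOn": {"@type": "Dataset", "identifier": based_on},
    }
    return json.dumps(doc, indent=2, sort_keys=True)


text = to_jsonld(
    "Cleaned temperature readings",
    "https://example.org/data/0002",   # invented identifier
    "https://example.org/data/0001",   # invented identifier
)
print(text)
```

Because the output is plain JSON with a shared vocabulary, any tool that understands that vocabulary can interpret the record without custom parsing.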

Lastly, the ‘R’ – Reusable. The ultimate goal is to optimize data reuse. Data and metadata should be described well enough that they can be replicated and combined in different settings. Data are Reusable if:

  • (Meta)data are richly described with a plurality of accurate and relevant attributes.
  • (Meta)data are released with a clear and accessible data usage license.
  • (Meta)data are associated with detailed provenance.
  • (Meta)data meet domain-relevant community standards.
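
The license, provenance, and standards conditions lend themselves to a simple structured record. The sketch below is one possible shape, assuming hypothetical `ReuseInfo` and `ProvenanceStep` classes and invented sample values; it checks only that each item is present, not that it is accurate or sufficiently detailed.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ProvenanceStep:
    activity: str   # what was done
    agent: str      # who or what did it
    date: str       # ISO 8601 date


@dataclass
class ReuseInfo:
    license: str = ""     # e.g. an SPDX identifier such as "CC-BY-4.0"
    standard: str = ""    # community standard the metadata follow
    provenance: List[ProvenanceStep] = field(default_factory=list)


def reuse_gaps(info):
    """Return the Reusable conditions that are not yet documented."""
    gaps = []
    if not info.license:
        gaps.append("no usage license")
    if not info.provenance:
        gaps.append("no provenance")
    if not info.standard:
        gaps.append("no community standard named")
    return gaps


# Invented sample values for illustration.
info = ReuseInfo(
    license="CC-BY-4.0",
    standard="DataCite Metadata Schema",
    provenance=[
        ProvenanceStep("collected", "rooftop sensor", "2023-05-01"),
        ProvenanceStep("cleaned", "analysis script", "2023-05-03"),
    ],
)
print(reuse_gaps(info))         # gaps for the documented record
print(reuse_gaps(ReuseInfo()))  # gaps for an empty record
```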

The FAIR Principles are the compass guiding us in the vast data landscape. They make data findable, accessible, interoperable, and reusable, while ensuring that computational systems can handle them efficiently.

As data evolve, remember these principles to navigate the data universe effectively and ethically.

Please follow along with the rest of the NSDC Data Science Flashcards series to learn more about data science ethics.