Data, Processes, and Liberation

Stephanie Yeung

Earlier this month, Riffyn CEO Tim Gardner had the honor of joining a panel during Lab of the Future LIVE — a global online congress that brings together leading life science companies and solution providers that are building the lab of the future.

Gardner shared the virtual stage with Mark Wall, Director of R&D Operations and Site Management at BASF, and Andrew Giessel, Associate Director of Data Science and AI at Moderna, for a panel discussion on “Liberating your bench scientists through AI.” Their conversation centered on collecting and structuring data for AI, and on how getting both right is critical for unleashing the full discovery potential of your data.

A common theme that the panelists returned to time and again was harnessing data efficiently to drive insight. And, according to both Gardner and Giessel, the best way to harness your data is to ensure you have good data at the very beginning.

But how? What tools should one use and what are the considerations?

Digital Transformation = Digital Liberation

"There is no perfect one single off-the-shelf product that is going to make the difference," said Wall, describing how his team implemented digital tools for BASF's enzyme analysis and management. "For us, in terms of really liberating the bench scientists, what we need to do is to take some of the best blendings of off-the-shelf products as well as custom products."

You can preview Wall's talk "Digital Transformation = Digital Liberation" below or watch the full presentation here.

In his talk, Wall shared a framework for approaching digital transformation to liberate bench scientists:

  1. Flexibility: As digital biology continues to grow, tools need the flexibility to support users across workflows rather than tying them to a single one.
  2. Data governance: Establish master data and keywords alongside structured data, capturing experiments and their context along with the data, for improved data quality. Hint: SharePoint folders organized in no particular structure are not the way to go.
  3. Streamlined user experience: Make tools efficient and engaging for bench scientists.
  4. Automated "drudgery": Eliminate manual chores such as copy-and-paste operations.
  5. Structured data: Deliver data to data scientists in a form that lets them start analysis right away.
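The structured-data points above can be made concrete with a minimal sketch. This is illustrative only: the field names and keywords below are assumptions, not BASF's or Riffyn's actual schema. The idea is simply that each measurement is captured together with its experimental context, so it lands in an analysis-ready table rather than a loose spreadsheet cell.

```python
# Illustrative sketch: capture a data point with its experimental context.
# Field names ("assay", "sample_id", etc.) are assumptions for this example.
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

@dataclass
class Measurement:
    """One data point recorded alongside its context metadata."""
    value: float
    unit: str
    assay: str          # master-data keyword, e.g. "enzyme-activity"
    sample_id: str      # consistently named sample
    operator: str
    recorded_at: str    # ISO-8601 UTC timestamp

def record(value, unit, assay, sample_id, operator):
    """Attach context and a timestamp at the moment of capture."""
    return Measurement(value, unit, assay, sample_id, operator,
                       datetime.now(timezone.utc).isoformat())

m = record(42.1, "U/mL", "enzyme-activity", "S-0001", "mwall")
row = asdict(m)  # one analysis-ready row; no copy-and-paste step needed
```

Because every row carries its own context, a data scientist can filter, join, and model directly, which is the "get to analysis right away" goal of point 5.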

Watch the on-demand recording of this Lab of the Future presentation here.

Designing the system around your use of data

“Think about how you’re going to use your data before you architect your data infrastructure,” emphasized Gardner when asked which one thing he wanted audience members to walk away with. “If you can’t use it, what’s the point of collecting it?”

It’s not as simple as it sounds. Unlike engineering and other technical disciplines, the sciences still largely operate manually. Experimental details and data are stored in paper notebooks, Excel spreadsheets, and SharePoint folders. There is generally no efficient, industry-wide convention for naming and storing data in an organized way. And the entire process is subject to human error, whether during transfer to digital format or simply because multiple people are working off different versions of the “same” document.

Why do we still operate so archaically? The reason, said Gardner, is that data systems are largely not designed to make data usable. Instead, they’re designed to store data, and stored data isn’t necessarily accessible or easy-to-use data.

What science needs is a system designed from the start to make data usable and in a form ready for machine learning at all times.
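The difference between stored and usable data can be sketched in a few lines. The example below is an assumption-laden illustration, not anything from the panel: a wide, human-oriented spreadsheet layout is merely stored, while a long-format table with one observation per row, each carrying explicit context, is what analysis and machine-learning tools expect.

```python
# Illustrative only: turning a "stored" wide spreadsheet layout into a
# "usable" long/tidy layout. Column names here are made up for the sketch.
wide_rows = [
    {"sample": "S-0001", "rep1": 0.31, "rep2": 0.33},
    {"sample": "S-0002", "rep1": 0.27, "rep2": 0.29},
]

# One observation per row, with the replicate made an explicit column
# instead of being encoded in the header.
tidy_rows = [
    {"sample": r["sample"], "replicate": k, "value": v}
    for r in wide_rows
    for k, v in r.items() if k != "sample"
]
```

In the wide form, the replicate identity lives only in a column header a human must interpret; in the tidy form it is data, so any downstream tool can group, filter, or model on it without manual reshaping.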

Riffyn Nexus: the Process Data System

The industry needs a rich data environment, not just a database. It was this need that drove Gardner to found Riffyn and build the company’s flagship Process Data System, Riffyn Nexus. It’s a process-centric system because, says Gardner, the context under which your data are collected is key.

Riffyn Nexus is a cloud platform that enables scientific teams to iteratively design their processes, analyze and visualize their data with a range of tools (including JMP, Spotfire, Python, and R), and always access the same data as one another. Materials and samples are consistently named thanks to the Riffyn ontology, and versioning lets scientists see and use historical data to inform better process designs and production decisions. And Riffyn Nexus automatically formats and contextualizes every data point you collect, so your data are always ready for machine learning and deep discovery.