The Four Pillars of a Digital Ecosystem for R&D

John Conway

As scientific R&D organizations continue to invest in digital tools, why have so few been able to create a digital ecosystem of interoperable software, data and analytics? The answer to this lies in the complexity of the challenge and the historical inability of software tools to support it. We’re changing that. We’re helping leading organizations build a digital ecosystem through our cloud-based process data system and our scientific services team. The results are transformative. Through experience we’ve identified four essential pillars of a successful digital ecosystem: data, technology, process, and people.

Data. Contextualized and FAIR (Findable, Accessible, Interoperable, Reusable)

Technology. Extensible and interoperable

People. Embracing change, incentivized and supported, viewing data as an asset

Process. Defined, transparent, continuously improving

Some Communications History First

In earlier days human communication was simple and localized. A gradual evolution took place from rock scratches and cave paintings to ingenious long-range smoke signaling. Of course, this evolved to papyrus scrolls, the printing press, telegraph, telephones, multimedia, and now today's instantaneous real-time communication through smartphones, laptops, tablets, and computers. It only took ~30,000 years.

But communication in the scientific laboratory has not benefited from these advances as much as other fields. The scientific lab is more complex and detail oriented than typical work environments. Before the advent of the electronic laboratory environment, which really didn’t come into full adoption until the 2000s, scientific communication in all its complexity took place face to face, on the phone, or in writing. The focus was always on the scientific experiment, the heart of the laboratory!

With the onset of the electronic lab, some of this interpersonal communication was replaced as scientists and researchers relied on electronic tools. However, like most evolutionary technology, the initial electronic laboratory tools were not conducive to human intuition or interaction. They were poor at capturing the what, the why, and the how of human thought, ideation, and development. Thus, while they advanced the digitization of data, they also created a huge gap in capturing and communicating the most fundamental element of science: the experiment design itself.

The electronic laboratory has since evolved and integrated many newer digital tools, from molecular design to workflow automation to data warehouses and data lakes to advanced visualizations. However, these tools have never really recovered what was lost when scientists talked about their experiments with each other, shared ideas, considered new things to try, and changed their minds.

Riffyn set out five years ago to change that equation. Harnessing the latest real-time cloud technologies, we invented a new process-centric paradigm in data system design that places human and scientific intuition, or experiment ideation, as the central “hero” in the digital story.

Riffyn Nexus (Riffyn SDE), our Software-as-a-Service (SaaS) platform, captures experiment data directly into the context of well-documented scientific processes and workflows — which are the foundation of conducting and communicating good science. Riffyn Nexus also tracks and traces the flow of materials and ontology-based data through the entire scientific lifecycle, both within a single experiment and across months of experiments. It then delivers this data in a clean, standardized, vendor-independent format that can be consumed and analyzed by any person, software system, or programmatic interface.

In short, Riffyn makes data and processes FAIR (Findable, Accessible, Interoperable and Reusable) and delivers them in an open ecosystem friendly to people and machines.

A Digital Ecosystem for Science Demanded a New Approach

Prior to Riffyn, there did not exist a digital backbone that could perform multiple, essential functions: capture and digitize scientific processes, design the experiment, contextualize the data, and present all the combined information as model-quality data for advanced analytics. Instead, there were three primary approaches to scientific data: sample-centric (LIMS), experiment-centric (ELN), and document/data archiving (SDMS).

LIMS tend to be static and inflexible, and thus require serious configuration or customization in order to become a useful tool. In such environments, scientists face huge obstacles to trying new ideas and experimenting with methodology. If they want to change something, they have to start over or ask for external software support to modify the data system. But who has time to wait for that? This rigidity becomes antagonistic to innovation.

On the other hand, ELNs, which are usually seen as a more flexible solution, were originally developed to capture IP, not as a tool to support scientific communication and data analytics. In fact, some companies even locked scientists out of their own notebooks after they completed an experiment report. The data, it was believed, should just be archived for the lawyers because there was no long-term value beyond the experiment report. As such, early on there was little effort to make that data findable, reusable, or mineable for deeper learnings. Some ELNs have since added this capability, and Riffyn is integrating with them.

Most significantly, ELNs and LIMS did not capture what may be the most important information of all: the scientific process that generated all that data. Organizations might map the lab workflow in Visio or PowerPoint at the beginning of an IT project. But typically, that document gets filed away after project start, never to be seen again. Moreover, the workflow is not the scientific process. The scientific process articulates all of the details that give meaning and interpretability to data, but that information is often simply absent.

Mind you, we are not dismissing the significant and valuable contributions that LIMS, ELNs, and SDMS have made to the digitization of science. They were all cutting edge at the time of their development and were essential in data- and automation-intensive environments. But these original tools were not designed to solve the more challenging problems of distributed science using modern data analytics and real-time internet communications. They were the first leg of a journey towards an ideal digital laboratory environment. It’s time now to launch the next leg of that journey.

Moving Forward Into the Future

What will a lab of the future with a fully functioning ecosystem look like? The answer is: it will appear to be almost identical to the labs of today but will actually be radically different.

It will still contain liquid handlers, liquid chromatography systems, lab benches, and people in white lab coats. However, the fundamental way in which science is developed - the experiment - will be utterly transformed. Out of that transformation comes the reshaped laboratory of the future.

The four pillars of the digital ecosystem - people, process, data, and technology - will function in completely new ways to support scientific research, which should have been front and center all along anyhow.

Let’s start with data. Simply put, laboratories will start to fail without contextualized and FAIR (Findable, Accessible, Interoperable, Reusable) data. FAIR data depends on metadata, which means capturing essential information about the how and what of an experiment. This contextualization adds enormously to the value of the data, which can then be located easily, used and reused, and shared. Without it, laboratories will remain a sieve through which valuable information is lost.
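To make the idea of contextualized data concrete, here is a minimal sketch in Python. It is purely illustrative: the Measurement class, its field names, the find helper, and the sample values are all hypothetical and are not taken from Riffyn Nexus or any other product. It only shows how storing the "how and what" next to each value makes the data findable later.

```python
from dataclasses import dataclass, field


@dataclass
class Measurement:
    """One data point plus the context that makes it reusable (hypothetical schema)."""
    value: float
    unit: str
    experiment_id: str   # which experiment produced the value (the "what")
    process_step: str    # which step of the process it came from (the "how")
    parameters: dict = field(default_factory=dict)  # settings in force at that step


def find(records, **context):
    """Return the records whose context fields match every given key/value pair."""
    return [
        r for r in records
        if all(getattr(r, key, None) == wanted for key, wanted in context.items())
    ]


# Made-up example data, for illustration only.
records = [
    Measurement(0.82, "g/L", "EXP-001", "fermentation", {"temp_C": 30}),
    Measurement(0.95, "g/L", "EXP-002", "fermentation", {"temp_C": 32}),
    Measurement(0.40, "AU", "EXP-001", "assay", {"wavelength_nm": 600}),
]

# Because context travels with each value, locating data is a simple filter.
fermentation = find(records, process_step="fermentation")
```

A bare list of numbers such as 0.82, 0.95, 0.40 could not answer the same question; the context fields are what make the values findable and reusable.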

Next, the process. This will be the most altered because it will be oriented to capture data from the scientific experimental process. If this doesn’t sound radical, it is.

With today’s start-and-stop-and-start-all-over-again way of experimentation, time and knowledge are continually wasted. Riffyn was determined to create a scientific process data system that doesn’t have any stop signs or red lights.

Riffyn Nexus lets scientists map the process of their experiment and change any aspect they want while accumulating data at every step along the way. Nothing is lost. There are no barriers to change. Even better, Riffyn Nexus cleans and shapes every process parameter and data point into a standard data frame ready for machine learning. Scientists can start their experiments knowing that their investigations will move along a clear path into a future where advanced analytics are guaranteed.
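As a rough illustration of what shaping process parameters and data points into one analysis-ready table can mean, the sketch below flattens per-step records into one wide row per sample. It is an assumption-laden toy, not Riffyn Nexus's actual data model or API: the to_rows function, the record layout, and the column naming scheme are all hypothetical.

```python
def to_rows(runs):
    """Flatten nested per-step records into one wide row per sample.

    Each column is named "<step>.<name>" so that parameters and measurements
    from different process steps can sit side by side in a single table.
    """
    rows = {}
    for run in runs:
        sample_id = run["sample_id"]
        row = rows.setdefault(sample_id, {"sample_id": sample_id})
        step = run["step"]
        for name, value in {**run["parameters"], **run["measurements"]}.items():
            row[f"{step}.{name}"] = value
    return list(rows.values())


# Made-up records for two samples moving through a two-step process.
runs = [
    {"sample_id": "S1", "step": "fermentation",
     "parameters": {"temp_C": 30}, "measurements": {"titer_g_per_L": 0.82}},
    {"sample_id": "S1", "step": "purification",
     "parameters": {"resin": "A"}, "measurements": {"yield_pct": 71.0}},
    {"sample_id": "S2", "step": "fermentation",
     "parameters": {"temp_C": 32}, "measurements": {"titer_g_per_L": 0.95}},
]

table = to_rows(runs)
```

Each row of table now carries a sample's upstream settings and downstream results together, which is the shape most statistics and machine learning tools expect. A sample that has not yet reached a step simply lacks those columns.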

People. Inextricably intertwined with process, the roles and responsibilities of laboratory staff will be different. Transitioning into these new roles is itself a process of learning new ways of thinking about and interacting with scientific data.

When leaders support this, the entire organization begins to value data and is incentivized to embrace change. The rewards are enormous. Satisfaction drastically increases when scientists are free to do science, and when data analysts can actually do sophisticated analysis rather than running around trying to locate data or figure out where it came from.

Finally, technology. Isn’t data a form of technology? Yes, but the technology pillar of a digital ecosystem really means open connectivity. All of the software tools and services that are used in laboratories must become interoperable.

Just as laboratories are expected to change, so too are the vendors who supply their tools. Within the digital ecosystem, the idea of partnership gets reshaped too. The future is too promising and the need for discoveries too pressing for science to be locked within proprietary tools that don’t support overall pipeline advancement. The connections of any solution - the beginnings and ends - must be created in such a way that they are easy to deploy.

If "digital ecosystem" sounds like the latest buzzword that the press and marketers are using to get your attention, it isn’t. Call it what you will, the profound change that these four pillars can bring should have everyone celebrating - from Main Street, to the lab, to Wall Street. Why? Because better experiment designs and better scientific outcomes will allow scientists and engineers to move twice as fast. They’ll be able to share data, information, and knowledge with collaborators with minimal effort, mistakes, and delays.

The scientific foundation for our future lies in clean, contextualized, and shared “model-quality” data in a digital ecosystem. Such an ecosystem, and the machine learning it enables, will never replace the human mind and intuition. But it will allow the mind to drive the future in a much more effective and efficient way. It will allow us all to Discover More. I wonder what things will be like in 30,000 years?!