Data Scientist / Data Analytics Engineer

Riffyn, Inc. - Oakland, CA

As a developer of Riffyn’s data analytics infrastructure, you will design and implement the technology engine driving Riffyn’s data processing pipeline. You will develop capabilities for user-specified mathematical transformations and scripting, and integrate them with open-source math and statistical packages. You will develop a high-performance, scalable computing infrastructure to ensure that these capabilities gracefully support our growing user base and their processing requirements. You will work closely with the Riffyn team to seamlessly integrate your work with the full software stack.

Riffyn is a venture-backed provider of research design and analytics software (SaaS) to the biotech, pharmaceutical, food, and chemical industries. Riffyn SaaS offers a unique “design-first” approach to scientific experimentation and product development that solves previously intractable data fragmentation and analysis issues. Riffyn provides global R&D organizations with unprecedented access to high-quality data, process design information, and integrative data analytics.


  • Lead feature development while engaging internal stakeholders
  • Build a high-performance data analytics and scalable cloud computing pipeline
  • Investigate state-of-the-art technologies
  • Write production-quality code, including unit tests
  • Resolve bugs and integration issues


  • MS or PhD in a quantitative discipline, graduate degree preferred
  • 5+ years of professional experience in software/data engineering/informatics
  • Excellence in communication with colleagues using formal reporting and informal dialog
  • Mastery of a scripting language, e.g., Python, for calling APIs of open-source software
  • Experience with shell scripting in a *nix environment
  • Experience optimizing ETL with SQL and NoSQL databases
  • Experience with source control, e.g., Git
  • Experience with distributed/high-performance computing, e.g., Hadoop, Spark


  • Experience in the analysis of scientific data in the life sciences using, e.g., R or JMP
  • Experience working in an agile software development environment
  • Experience publishing and maintaining open-source GitHub repositories
  • Knowledge of the complexity of statistical calculations and machine learning algorithms
  • Experience with management of asynchronous task queues, e.g., Celery, RabbitMQ
  • Familiarity with columnar and graph database technologies, e.g., Cassandra, Neo4j
  • Familiarity with AWS
  • Web application development experience using JavaScript-related technologies
  • Production experience with designing and building Docker images
