Show Notes

One of the big promises of data science is its ability to combine multiple disparate datasets to produce value-creating insights. But this is only possible if you can get all those disparate datasets together in one location to begin with. This has led to the rise of the data engineer and the data orchestration platform.

In this episode, Sandy Ryza joins Dr Genevieve Hayes to discuss the impact of the data scientist on the creation of the next generation of data orchestration tools.

Guest Bio

Sandy Ryza is a data scientist turned data engineer who is currently the lead engineer on the Dagster project, an open-source data orchestration platform used in MLOps, data science, IoT and analytics. He is also the co-author of Advanced Analytics with Spark.


  • Welcome to Value Driven Data Science (00:00)
  • Introducing Sandy Ryza and his journey from data scientist to data engineer (01:30)
  • Navigating the challenges of creating consistent data definitions within teams (05:11)
  • The birth and development of Dagster (11:32)
  • Dagster: A tool designed for data scientists (20:54)
  • Final thoughts and advice for data scientists (37:29)


Value Driven Data Science
Episode 39: The Impact of Data Science on Data Orchestration