Mar 12, 2020
Flyte: Lyft Data Processing Platform with Allyson Gale and Ketan Umare
Lyft is a ridesharing company that generates a high volume of data every day.
This data includes ride history, pricing information, mapping, routing, and financial transactions. The data is stored across a variety of different databases, data lakes, and queueing systems, and is processed at scale in order to generate machine learning models, reports, and data applications.
Data workflows involve a set of interconnected systems such as Kubernetes, Spark, Tensorflow, and Flink. In order for these systems to work together harmoniously, a workflow manager is often used to orchestrate them together. A workflow platform lets a data engineer have a high-level view into how data moves through the system, and can be used to reason about retries, resource utilization, and scalability.
Flyte is a data processing system built and open-sourced at Lyft. Allyson Gale and Ketan Umare work at Lyft, and they join the show to talk about how Flyte works, and why they needed to build a new workflow proce…