45: Trino swimming with the DolphinScheduler
Apache DolphinScheduler is a popular data workflow orchestrator for running complex data pipelines. The project recently added a Trino integration, and this episode demonstrates how to use DolphinScheduler to run a series of transformations on the data lakehouse with Trino.
- Intro Music: 0:00
- Intro: 0:31
- Trino release 407: 13:22
- What is workflow orchestration?: 21:12
- Why do we need a workflow orchestration tool for building a data lake?: 31:07
- What is Apache DolphinScheduler?: 37:35
- Does DolphinScheduler have any computing engine or storage layer?: 53:11
- How does it differ from other workflow orchestration tools, such as Apache Airflow?: 58:46
- Demo: Creating a simple Trino workflow in DolphinScheduler: 1:26:44
- PR: Improve performance of Parquet files: 1:47:04
Show Notes: https://trino.io/episodes/45
Show Page: https://trino.io/broadcast/