Content provided by The Data Flowcast. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by The Data Flowcast or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here: https://no.player.fm/legal.

Inside Vinted’s Code-Generated Airflow Pipelines with Oscar Ligthart and Rodrigo Loredo

29:36

The shift from monolithic to decentralized data workflows changes how teams build, connect and scale pipelines.

In this episode, we feature Oscar Ligthart, Lead Data Engineer, and Rodrigo Loredo, Lead Analytics Engineer, both at Vinted, as we unpack their YAML-driven abstraction that generates Airflow DAGs and standardizes cross-team orchestration.

Key Takeaways:

00:00 Introduction.

05:28 Challenges of decentralization.

06:45 YAML-based generator standardizes pipelines and dependencies.

12:28 Declarative assets and sensors align cross-DAG dependencies.

17:29 Task-level callbacks enable auto-recovery and clear ownership.

21:39 Standardized building blocks simplify upgrades and maintenance.

24:52 Platform focus frees domain work.

26:49 Container-only standardization prevents sprawl.

Resources Mentioned:

Oscar Ligthart

https://www.linkedin.com/in/oscar-ligthart/

Rodrigo Loredo

https://www.linkedin.com/in/rodrigo-loredo-410a16134/

Vinted | LinkedIn

https://www.linkedin.com/company/vinted/

Vinted | Website

https://www.vinted.com/?srsltid=AfmBOor87MGR_eLOauCO93V9A-aLDaAhGYx9cnu_oN8s1SAXMlCRuhW7

Apache Airflow

https://airflow.apache.org/

Kubernetes

https://kubernetes.io/

dbt

https://www.getdbt.com/

Google Cloud Vertex AI

https://cloud.google.com/vertex-ai

Airflow Datasets & Assets (concepts)

https://www.astronomer.io/docs/learn/airflow-datasets

Airflow Summit

https://airflowsummit.org/

Thanks for listening to “The Data Flowcast: Mastering Apache Airflow® for Data Engineering and AI.” If you enjoyed this episode, please leave a 5-star review to help get the word out about the show. And be sure to subscribe so you never miss any of the insightful conversations.

#AI #Automation #Airflow #MachineLearning

82 episodes
