Break down scheduled workflow complexity with Airflow

Full Featured (30 min.)
[Infrastructure]

Everyone starts the same way when they need to handle a scheduled task: you let *nix cron handle that “easy” task. Then you realize that you need to move data from here to there at a given time, so you stick with the same cron approach. But as the business grows, handling scheduled tasks for different purposes at scale becomes a concern, with too much complexity to manage:

  • Task logging
  • Task timeouts
  • Retries on failure
  • Task priorities & SLA
  • Task duration history
  • Notifications on task error
  • Task dependencies (fan-in / fan-out)
  • Task modularity
  • Task parallelization

In this talk, we will take a deep dive into Apache Airflow and how it helps us manage that complexity.
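
To make a few of the concerns listed above concrete, here is a minimal sketch in plain Python of what a cron entry cannot express on its own: task dependencies (fan-out and fan-in), retries on failure, and a per-attempt log. This is not Airflow's API. The task names and the helper functions `run_with_retries` and `run_workflow` are hypothetical, chosen only for illustration, and the sketch assumes Python 3.9 or later for the standard-library `graphlib` module.

```python
from graphlib import TopologicalSorter


def run_with_retries(name, func, retries, log):
    """Run one task, retrying on failure; record every attempt in `log`."""
    for attempt in range(1, retries + 2):  # first try plus `retries` retries
        try:
            func()
            log.append((name, attempt, "success"))
            return True
        except Exception as exc:
            log.append((name, attempt, f"failed: {exc}"))
    return False


def run_workflow(tasks, dependencies, retries=2):
    """Run tasks so that every task starts only after its upstream tasks.

    `dependencies` maps a task name to the set of tasks it depends on.
    Simplification: the whole run stops at the first task that exhausts
    its retries, even if some remaining tasks do not depend on it.
    """
    log = []
    for name in TopologicalSorter(dependencies).static_order():
        if not run_with_retries(name, tasks[name], retries, log):
            break
    return log


# Hypothetical workflow: extract fans out to clean and enrich,
# which fan back in to load.
attempts = {"count": 0}


def flaky_enrich():
    """Fail on the first call only, to exercise the retry path."""
    attempts["count"] += 1
    if attempts["count"] == 1:
        raise RuntimeError("temporary outage")


tasks = {
    "extract": lambda: None,
    "clean": lambda: None,
    "enrich": flaky_enrich,
    "load": lambda: None,
}
dependencies = {
    "clean": {"extract"},
    "enrich": {"extract"},
    "load": {"clean", "enrich"},
}

log = run_workflow(tasks, dependencies)
for entry in log:
    print(entry)
```

Even this toy version needs an explicit dependency graph, a retry loop, and a log. Timeouts, priorities, SLAs, duration history, notifications, and parallel execution would each add more code, which is the kind of complexity a workflow scheduler is meant to take off your hands.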