Based on Python
ETL/ELT processes are described in Python, so anyone who knows Python will find Airflow easy to use.
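Because pipeline logic is ordinary Python, each ETL step can be written as a plain function. The following is a minimal sketch rather than Airflow's own API: the function names and sample records are hypothetical, and in a real DAG each function would typically be wrapped in a task (for example with `PythonOperator` or the `@task` decorator).

```python
# Hypothetical ETL steps written as plain Python functions.
# In Airflow each one would typically become a task in a DAG.


def extract():
    """Return raw records; a real task would read from a database or an API."""
    return [{"user": "alice", "amount": "10.5"}, {"user": "bob", "amount": "3"}]


def transform(records):
    """Convert the amount field from text to float."""
    return [{**r, "amount": float(r["amount"])} for r in records]


def load(records):
    """Stand-in for a load step; here it only returns the total amount."""
    return sum(r["amount"] for r in records)


if __name__ == "__main__":
    print(load(transform(extract())))
```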
A small but full-fledged toolkit
Well suited to creating and managing data processing pipelines. You can work with Airflow through the CLI, the REST API, and a web interface built on the Flask Python framework.
Airflow integrates with many databases (MySQL, PostgreSQL, DynamoDB, Hive), big data storage systems (HDFS, Amazon S3), and cloud platforms (Google Cloud Platform, Amazon Web Services, Microsoft Azure).
An extensible REST API
Makes it relatively easy to integrate Airflow into an existing enterprise IT landscape and flexibly customize data pipelines.
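As an illustration of integrating through the REST API, the sketch below builds (but does not send) a request that would trigger a DAG run. It assumes the Airflow 2.x stable API path `/api/v1/dags/{dag_id}/dagRuns`; the host, the DAG id, and the `conf` payload are hypothetical, and authentication depends on how the deployment is configured.

```python
import json
import urllib.request

# Assumption: placeholder base URL for an Airflow 2.x webserver.
BASE_URL = "http://localhost:8080/api/v1"


def build_trigger_request(dag_id, conf):
    """Build a POST request that asks Airflow to start a run of dag_id."""
    body = json.dumps({"conf": conf}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/dags/{dag_id}/dagRuns",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # "example_etl" and the conf keys are hypothetical.
    req = build_trigger_request("example_etl", {"run_date": "2024-01-01"})
    print(req.get_method(), req.full_url)
    # Sending would be urllib.request.urlopen(req), with whatever
    # authentication headers the deployment requires added first.
```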
Monitoring and alerting
Integration with StatsD and Fluentd is supported for collecting and sending metrics and logs.
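For context, StatsD metrics are short text datagrams of the form `name:value|type`, usually sent over UDP. A minimal sketch of that wire format follows; the metric names are hypothetical examples, not a list of what Airflow emits.

```python
import socket


def statsd_line(name, value, metric_type):
    """Format one StatsD datagram: c = counter, g = gauge, ms = timer."""
    return f"{name}:{value}|{metric_type}"


def send_metric(line, host="localhost", port=8125):
    """Send one datagram over UDP (8125 is the conventional StatsD port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))


if __name__ == "__main__":
    # Hypothetical metric names, printed rather than sent.
    print(statsd_line("pipeline.task_success", 1, "c"))
    print(statsd_line("pipeline.task_duration", 320, "ms"))
```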
Airflow provides five roles with different access levels: Admin, Public, Viewer, Op, and User. Integration with Active Directory and flexible access configuration through RBAC are also possible.
Pipelines and the individual tasks in them can be covered with basic unit tests.
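A minimal sketch of such a test, using the standard `unittest` module. The `normalize_amounts` function is a hypothetical stand-in for the callable behind a task; testing it directly keeps the test independent of a running scheduler.

```python
import unittest


def normalize_amounts(records):
    """Hypothetical task logic: drop rows without an amount, cast the rest to float."""
    return [
        {**r, "amount": float(r["amount"])}
        for r in records
        if r.get("amount") not in (None, "")
    ]


class NormalizeAmountsTest(unittest.TestCase):
    def test_casts_amount_to_float(self):
        result = normalize_amounts([{"id": 1, "amount": "2.5"}])
        self.assertEqual(result, [{"id": 1, "amount": 2.5}])

    def test_drops_rows_without_amount(self):
        rows = [{"id": 1, "amount": ""}, {"id": 2}]
        self.assertEqual(normalize_amounts(rows), [])
```

Saved to a file, the tests can be run with `python -m unittest`.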
Airflow scales well thanks to its modular architecture and its use of a message queue, so it can serve a very large number of DAGs.
Airflow is actively maintained by the community and is well documented.
What is Luigi?
Luigi is a Python framework for building complex sequences of dependent tasks. A fairly large part of the framework is aimed at transforming data from various sources (MySQL, MongoDB, Redis) using various tools, from launching a process to running tasks of various types on a Hadoop cluster.