Dockerized Apache Airflow Setup for Local Development

A Dockerized Apache Airflow setup designed for straightforward orchestration and management of workflows in a local development environment. This repository provides a Docker Compose configuration integrating Airflow with PostgreSQL and Redis, enabling rapid deployment and testing.


Features

  • Complete Docker Compose environment including Airflow components, PostgreSQL, and Redis
  • Custom Dockerfile to build Airflow images tailored for this setup
  • Sample DAGs demonstrating basic and advanced workflow orchestration
  • Fernet key generation script to secure Airflow metadata
  • Organized directories for DAGs, logs, plugins, and SQL scripts

Tech Stack

  • Apache Airflow 2.x
  • Docker & Docker Compose
  • PostgreSQL 13
  • Redis 6.2
  • Python (for DAG definitions and utility scripts)
  • Neo4j (integrated in sample DAG for graph database operations)

Getting Started

Prerequisites

  • Docker
  • Docker Compose

Installation & Running

  1. Clone the repository:
git clone https://github.com/justin-napolitano/airflow-docker.git
cd airflow-docker
  2. Generate a Fernet key (used by Airflow for encryption):
python3 fernet_key_generator.py
  3. Export the Fernet key and set the SQLAlchemy connection string for PostgreSQL in your shell environment:
export AIRFLOW__CORE__FERNET_KEY="<your_generated_fernet_key>"
export AIRFLOW__DATABASE__SQL_ALCHEMY_CONN="postgresql+psycopg2://airflow:airflow@postgres/airflow"
  4. Build and start the Docker containers:
docker-compose up --build
  5. Access the Airflow webserver at http://localhost:8089.
  6. Place your DAG files inside the dags/ directory to have them automatically loaded.
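The repository's fernet_key_generator.py is not reproduced here, but a Fernet key is simply 32 random bytes encoded as URL-safe base64. A minimal stdlib-only sketch that emits a key in the same format (the actual script may instead use cryptography.fernet.Fernet.generate_key(), which produces identical output):

```python
import base64
import os

def generate_fernet_key() -> str:
    """Return a Fernet-format key: 32 random bytes, URL-safe base64-encoded."""
    return base64.urlsafe_b64encode(os.urandom(32)).decode("ascii")

if __name__ == "__main__":
    # Print the key so it can be exported as AIRFLOW__CORE__FERNET_KEY
    print(generate_fernet_key())
```

Keep the generated key stable across restarts; if it changes, Airflow can no longer decrypt connection passwords already stored in its metadata database.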

Project Structure

airflow-docker/
├── dags/                    # Airflow DAG definitions
│   ├── hello-world.py       # Example DAG printing "Hello, world!"
│   └── sup_court_graph_workflow.py  # DAG interacting with Neo4j graph database
├── logs/                   # Airflow logs
├── plugins/                # Custom Airflow plugins (empty by default)
├── sql/                    # Cypher and SQL scripts for workflows
├── Dockerfile              # Custom Dockerfile to build Airflow image
├── docker-compose.yml      # Docker Compose configuration
├── fernet_key_generator.py # Script to generate Fernet key
├── generate_fernet_key.sh  # Shell script alternative to generate Fernet key
├── README.md               # This file
└── requirements.txt        # Python dependencies

Future Work / Roadmap

  • Expand sample DAGs with more complex workflows and integrations
  • Add automated tests for DAGs and environment setup
  • Provide support for additional Airflow plugins and operators
  • Enhance documentation with troubleshooting and advanced configuration guides
  • Explore deployment options beyond local Docker, e.g., Kubernetes