Docker Compose Setup for Ingesting Hacker News into Postgres

github repo

This is a minimal, production-leaning starter: run docker compose up to ingest Hacker News into a raw Postgres table, then run a SQL upsert into a canonical tech staging table and build a basic velocity mart.

What's included

  • apps/tech: Python ingestion job for Hacker News (the worker for the tech niche)
  • libs/: shared persistence + HTTP client utilities
  • sql/: DDL + transforms (Bronze→Silver→Gold)
  • runner/: simple SQL runner to apply transforms in order
  • docker-compose: Postgres + Adminer + Tech worker + SQL runner
  • .env.example: env vars for Postgres connection

Quick start

cp .env.example .env
docker compose up --build

Then open Adminer at http://localhost:8080 and log in with system: PostgreSQL, server: db, user: postgres, pass: postgres, db: trends.
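
To check the same connection settings from Python, a small helper can assemble a connection URL from the environment. This is only a sketch: the variable names (POSTGRES_USER, POSTGRES_PASSWORD, POSTGRES_HOST, POSTGRES_PORT, POSTGRES_DB) are assumptions and may not match .env.example. The defaults are the Adminer values above, plus 5432, the standard Postgres port.

```python
import os
from urllib.parse import quote


def postgres_dsn(env=None):
    """Build a postgresql:// URL from environment-style settings.

    The variable names are assumptions for illustration; adjust them
    to whatever .env.example actually defines.
    """
    env = os.environ if env is None else env
    user = env.get("POSTGRES_USER", "postgres")
    password = env.get("POSTGRES_PASSWORD", "postgres")
    host = env.get("POSTGRES_HOST", "db")      # Compose service name
    port = env.get("POSTGRES_PORT", "5432")    # default Postgres port
    database = env.get("POSTGRES_DB", "trends")
    # Percent-encode credentials so special characters survive in a URL.
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(database, safe='')}"
    )
```

Inside the Compose network the host is the service name db. From the host machine it would be localhost, provided the Postgres port is published.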

Apply transforms

The runner container automatically runs the SQL in /sql on startup. To re-run the transforms manually:

docker compose run --rm runner python /app/runner.py
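
As a rough illustration of what "apply transforms in order" can mean, the sketch below sorts .sql files by layer (Bronze, then Silver, then Gold) and then by path. It is not the actual runner.py. The bronze/silver/gold directory names are an assumption based on the sql/silver/ path mentioned in this README, and files outside those directories are placed last.

```python
from pathlib import Path

# Assumed layer directory names; anything else sorts after these.
LAYER_ORDER = {"bronze": 0, "silver": 1, "gold": 2}


def ordered_sql_files(sql_root):
    """Return all .sql files under sql_root in Bronze -> Silver -> Gold order.

    Within a layer, files are ordered by their relative path so that a
    run is deterministic.
    """
    root = Path(sql_root)

    def sort_key(path):
        rel = path.relative_to(root)
        # The first path component is the layer, if the file is nested.
        layer = rel.parts[0] if len(rel.parts) > 1 else ""
        return (LAYER_ORDER.get(layer, len(LAYER_ORDER)), rel.as_posix())

    return sorted(root.rglob("*.sql"), key=sort_key)
```

A runner built on this would read each file in the returned order and execute it against Postgres, ideally stopping at the first failure so later layers never build on a half-applied earlier one.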

Next steps

  • Add more sources (Reddit, GitHub Trending) to apps/tech/main.py.
  • Create apps/finance and sql/silver/stg_trend_items_finance.sql following the same pattern.
  • Decide on dbt vs. custom runner (dbt Core recommended once models grow).

Note: The ingestion job is idempotent: it deduplicates on unique payload hashes and upserts by id and (source, url). Tune the polling cadence in apps/tech/config.yaml.
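
One way such payload hashing can work is sketched below. It is an assumption, not necessarily what libs/ does: the payload is serialized to canonical JSON (sorted keys, compact separators) and hashed with SHA-256, so the same item always yields the same hash regardless of key order. The function names payload_hash and new_payloads are hypothetical, and the in-memory set stands in for a unique constraint on the hash column.

```python
import hashlib
import json


def payload_hash(payload):
    """Return a stable SHA-256 hex digest for a JSON-serializable payload."""
    canonical = json.dumps(
        payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def new_payloads(payloads, seen_hashes):
    """Return only payloads whose hash is not yet in seen_hashes.

    seen_hashes is updated in place, mimicking what a unique index on
    the hash column would enforce in the database.
    """
    fresh = []
    for payload in payloads:
        digest = payload_hash(payload)
        if digest in seen_hashes:
            continue  # already ingested; skipping keeps re-runs idempotent
        seen_hashes.add(digest)
        fresh.append(payload)
    return fresh
```

With this scheme, re-polling the same Hacker News items produces no new raw rows, while an item whose content changed gets a new hash and is stored again.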
