NeuroNews is an advanced ETL pipeline designed to scrape, analyze, and visualize politics and technology news using AI-powered NLP, sentiment analysis, and knowledge graph-based insights. Built on AWS cloud infrastructure, it enables real-time event detection, entity linking, and customizable dashboards for deeper news intelligence.

2 open issues need help. Last updated: Sep 6, 2025

AI Summary: This issue outlines the creation of a PySpark ingestor to move data from RSS/HTTP sources into a `bronze.articles_raw` Delta table. It requires a Python script to fetch data and write newline-delimited JSON to a landing zone, followed by a Spark Structured Streaming job to read, parse, enforce schema, and write to the Delta table with checkpointing. A local `make spark-demo` target will facilitate testing by dropping a fixture and starting the job, with success verified by new data appearing in the bronze table.

Complexity: 3/5
Labels: feature, backend, good first issue, ingestion
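
The two stages in the summary above (fetch to a newline-delimited JSON landing zone, then stream into Delta) might look roughly like the sketch below. It is not the project's implementation. The function names, the five-field string schema, and the file naming are assumptions made for illustration. Only the `bronze.articles_raw` table name and the use of checkpointing come from the issue. The streaming function additionally assumes a Spark session already configured for Delta Lake and an existing `bronze` schema.

```python
import json
import urllib.request
import uuid
import xml.etree.ElementTree as ET
from pathlib import Path

# Hypothetical field list; the issue does not specify the real schema.
FIELDS = ("source", "title", "link", "published", "summary")


def fetch_feed(url: str) -> str:
    """Download a feed over HTTP and return its body as text."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def rss_to_records(rss_xml: str, source: str) -> list[dict]:
    """Turn each RSS <item> into a flat dict; missing tags become ""."""
    root = ET.fromstring(rss_xml)
    return [
        {
            "source": source,
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "published": item.findtext("pubDate", default=""),
            "summary": item.findtext("description", default=""),
        }
        for item in root.iter("item")
    ]


def write_ndjson(records: list[dict], landing_dir: str) -> Path:
    """Write one JSON object per line to a new file in the landing zone."""
    out_dir = Path(landing_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    final = out_dir / f"articles-{uuid.uuid4().hex}.json"
    # Write under a dot-prefixed name first, then rename, so a reader
    # listing the directory does not pick up a half-written file.
    tmp = out_dir / f".{final.name}.tmp"
    with tmp.open("w", encoding="utf-8") as fh:
        for rec in records:
            fh.write(json.dumps(rec, ensure_ascii=False) + "\n")
    tmp.rename(final)
    return final


def start_bronze_stream(spark, landing_dir: str, checkpoint_dir: str):
    """Stream landing-zone files into bronze.articles_raw (append mode)."""
    # Imported lazily so the helpers above run without PySpark installed.
    from pyspark.sql.types import StringType, StructField, StructType

    # An explicit schema is required for streaming file sources and
    # doubles as the schema-enforcement step.
    schema = StructType([StructField(n, StringType(), True) for n in FIELDS])
    stream = spark.readStream.schema(schema).json(landing_dir)
    return (
        stream.writeStream.format("delta")
        .option("checkpointLocation", checkpoint_dir)
        .outputMode("append")
        .toTable("bronze.articles_raw")
    )
```

A local demo in the spirit of `make spark-demo` could call `write_ndjson` on a fixture feed, start the stream, and then check that `bronze.articles_raw` has gained rows.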

Python

AI Summary: This issue aims to establish a local development environment using Docker Compose, integrating Spark (master, worker, history server), MinIO for object storage, and Delta Lake for data tables. It also includes monitoring tools like Prometheus and Grafana, with specific configuration files and Make targets for easy management.

Complexity: 3/5
Labels: feature, good first issue, infra, ingestion

Python