Open Issues Need Help
AI Summary: This task requires creating a minimal AWS infrastructure using Terraform to deploy a data ingestion service on ECS Fargate. The infrastructure should include an ECS cluster, ECR repository, IAM roles, security groups, and Secrets Manager integration. The ingestion service's scheduling will be handled either by Prefect or the Python `schedule` library, not AWS EventBridge. The task also involves updating the ingestion service's code to use the chosen scheduling method and configuring it to interact with the new infrastructure.
AI Summary: Implement a system to download and store PDF documents linked in a grievance database. This involves creating a document download service, handling local and S3 storage, managing errors, batch processing, and integrating with the existing data ingestion pipeline. The system should track download progress and handle various document types and potential issues like large files, duplicates, and security concerns.
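As a rough illustration of the local-storage side of the document download issue, the sketch below stores each PDF under its SHA-256 digest so duplicates are detected, rejects oversized or non-PDF payloads, and collects per-document failures during a batch. The names `store_pdf`, `process_batch`, and `MAX_BYTES`, the 50 MB limit, and the injected `fetch` callable are all assumptions made for the example; the issue itself does not specify them, and S3 storage is left out.

```python
import hashlib
from pathlib import Path
from typing import Callable, Dict, Iterable, List, Tuple

# Assumed size limit for "large files"; the issue does not state a value.
MAX_BYTES = 50 * 1024 * 1024


def store_pdf(content: bytes, dest_dir: Path, max_bytes: int = MAX_BYTES) -> Tuple[Path, bool]:
    """Write a PDF to dest_dir, named by content hash. Returns (path, is_new)."""
    if len(content) > max_bytes:
        raise ValueError("document exceeds size limit")
    # Basic safety check: PDF files start with the "%PDF-" magic bytes.
    if not content.startswith(b"%PDF-"):
        raise ValueError("payload is not a PDF")
    digest = hashlib.sha256(content).hexdigest()
    dest_dir.mkdir(parents=True, exist_ok=True)
    path = dest_dir / f"{digest}.pdf"
    if path.exists():
        return path, False  # identical content already stored
    path.write_bytes(content)
    return path, True


def process_batch(
    urls: Iterable[str],
    fetch: Callable[[str], bytes],
    dest_dir: Path,
) -> Dict[str, List]:
    """Download and store each URL, recording progress instead of aborting on errors."""
    results: Dict[str, List] = {"stored": [], "duplicate": [], "failed": []}
    for url in urls:
        try:
            path, is_new = store_pdf(fetch(url), dest_dir)
        except Exception as exc:  # one bad document should not stop the batch
            results["failed"].append((url, str(exc)))
            continue
        results["stored" if is_new else "duplicate"].append((url, path.name))
    return results
```

Passing `fetch` in as a parameter keeps the HTTP client out of the sketch and makes the batch logic easy to exercise with canned bytes.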
AI Summary: Implement an API request tracking system to prevent redundant calls during data ingestion. This involves creating a database table to track processed API requests, modifying the ingestion logic to check for already processed requests, and adding CRUD operations for managing the tracking data. The system should handle failures gracefully, allow configurable data freshness, and include logging and testing.
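One way the request-tracking idea could look, sketched with the standard-library `sqlite3` module: each request is reduced to a stable key, a tracking table records the outcome, and ingestion skips requests that succeeded within a configurable freshness window. The table name `api_request_log`, the function names, the one-hour default, and the `fetch` callable are hypothetical; the project's actual database and schema are not described in the issue.

```python
import hashlib
import json
import sqlite3
import time


def init_tracking(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) tracking table if it does not exist."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS api_request_log (
               request_key  TEXT PRIMARY KEY,
               status       TEXT NOT NULL,
               processed_at REAL NOT NULL
           )"""
    )


def request_key(endpoint: str, params: dict) -> str:
    """Stable key for an endpoint plus parameters, independent of dict ordering."""
    payload = json.dumps({"endpoint": endpoint, "params": params}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def is_fresh(conn, key, max_age_seconds, now=None) -> bool:
    """True if the request succeeded within the last max_age_seconds."""
    now = time.time() if now is None else now
    row = conn.execute(
        "SELECT processed_at FROM api_request_log "
        "WHERE request_key = ? AND status = 'success'",
        (key,),
    ).fetchone()
    return row is not None and (now - row[0]) <= max_age_seconds


def record_request(conn, key, status, now=None) -> None:
    """Insert or overwrite the tracking row for this request."""
    now = time.time() if now is None else now
    conn.execute(
        "INSERT OR REPLACE INTO api_request_log "
        "(request_key, status, processed_at) VALUES (?, ?, ?)",
        (key, status, now),
    )
    conn.commit()


def ingest(conn, endpoint, params, fetch, max_age_seconds=3600, now=None):
    """Call fetch only when no fresh successful record exists; return None when skipped."""
    key = request_key(endpoint, params)
    if is_fresh(conn, key, max_age_seconds, now):
        return None
    try:
        data = fetch(endpoint, params)
    except Exception:
        # Record the failure so it is visible, then let the caller handle it.
        record_request(conn, key, "failed", now)
        raise
    record_request(conn, key, "success", now)
    return data
```

Because only rows with status `success` count as fresh, a failed request is retried on the next run rather than being skipped.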