Open Issues That Need Help
Distributed DNS + HTTP crawler with queueing, deduplication, tracing, and ingest batching.

AI Summary: The task is to create a Docker Compose setup for local development and testing of the crawler. This includes containers for the crawler itself, Redis for deduplication, a mock ingest endpoint, Prometheus for metrics, and Grafana for visualization. The setup should deploy easily across operating systems and include persistent volumes, environment-specific configuration, and health checks. Additional considerations include adding Jaeger for distributed tracing and providing sample data and scripts for testing.
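A minimal sketch of what such a Compose file might look like. Service names, image tags, and the REDIS_ADDR environment variable are assumptions for illustration, not decisions taken from the issue; the mock ingest endpoint is left out for brevity.

```yaml
# Hypothetical docker-compose.yml sketch for local development.
version: "3.9"
services:
  redis:
    image: redis:7-alpine
    ports: ["6379:6379"]
    volumes: [redis-data:/data]     # persistent dedup state across restarts
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 5s
  prometheus:
    image: prom/prometheus:latest
    ports: ["9090:9090"]
  grafana:
    image: grafana/grafana:latest
    ports: ["3000:3000"]
    depends_on: [prometheus]
  jaeger:
    image: jaegertracing/all-in-one:latest
    ports: ["16686:16686"]          # tracing UI
  spyder:
    build: .                        # assumes a Dockerfile at the repo root
    environment:
      REDIS_ADDR: redis:6379        # hypothetical env var name
    depends_on: [redis]
volumes:
  redis-data:
```

Pinning image tags (rather than `latest`) would make the environment more reproducible across machines.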
AI Summary: Add comprehensive unit and integration tests to the Go-based SPYDER project, targeting core packages like probe, DNS, extraction, deduplication, batch emission, robots.txt handling, TLS info retrieval, and rate limiting. Aim for at least 70% code coverage per package, using Go's standard testing package and potentially testify for assertions. Mock external dependencies where necessary and include test fixtures for mock data. Tests should be fast and self-contained.
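To illustrate the table-driven style the issue suggests, here is a self-contained sketch of a seen-set check of the kind the deduplication package might expose. The `Deduper` type and its `Seen` method are hypothetical stand-ins, not SPYDER's actual API.

```go
package main

import "fmt"

// Deduper is a hypothetical in-memory seen-set; the real dedup
// package in SPYDER may use Redis and a different interface.
type Deduper struct {
	seen map[string]struct{}
}

func NewDeduper() *Deduper {
	return &Deduper{seen: make(map[string]struct{})}
}

// Seen reports whether u was already observed, and marks it as seen.
func (d *Deduper) Seen(u string) bool {
	if _, ok := d.seen[u]; ok {
		return true
	}
	d.seen[u] = struct{}{}
	return false
}

func main() {
	// Table-driven cases, the pattern Go's testing package encourages.
	cases := []struct {
		url  string
		want bool
	}{
		{"https://example.com/", false},
		{"https://example.com/a", false},
		{"https://example.com/", true}, // duplicate of the first case
	}
	d := NewDeduper()
	for _, c := range cases {
		got := d.Seen(c.url)
		fmt.Printf("%s seen=%v want=%v\n", c.url, got, c.want)
	}
}
```

In a real `_test.go` file the loop body would call `t.Errorf` on a mismatch instead of printing, and external dependencies (Redis, the network) would be mocked as the issue recommends.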
AI Summary: Enhance the CLI of the crawler (written in Go) to improve user experience: progress bars, real-time statistics, a dry-run mode, verbose/quiet flags, stdin support, subcommands (crawl, validate, export, stats), shell completion, a version flag, and improved help text, potentially building on a CLI framework such as Cobra or urfave/cli.
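A stdlib-only sketch of the subcommand layout the issue proposes; the actual tool may adopt Cobra instead, and the subcommand names, flags, and messages below are assumptions for illustration. Returning a string from `run` keeps the dispatch easy to exercise in tests.

```go
package main

import (
	"flag"
	"fmt"
)

// run dispatches one of the proposed subcommands. Only crawl's
// --dry-run flag is sketched here; validate/export/stats are stubs.
func run(args []string) string {
	if len(args) == 0 {
		return "usage: spyder <crawl|validate|export|stats> [flags]"
	}
	switch args[0] {
	case "crawl":
		fs := flag.NewFlagSet("crawl", flag.ContinueOnError)
		dryRun := fs.Bool("dry-run", false, "plan the crawl without sending requests")
		if err := fs.Parse(args[1:]); err != nil {
			return err.Error()
		}
		if *dryRun {
			return "crawl: dry-run, no requests sent"
		}
		return "crawl: starting"
	case "validate":
		return "validate: checking input domain list"
	case "export":
		return "export: writing results"
	case "stats":
		return "stats: printing crawl statistics"
	default:
		return fmt.Sprintf("unknown subcommand %q", args[0])
	}
}

func main() {
	fmt.Println(run([]string{"crawl", "--dry-run"}))
}
```

Cobra would provide the shell completion, version flag, and generated help text the issue asks for with less hand-written dispatch.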