Open Issues Need Help
AI Summary: Enhance the web crawler's robots.txt handling to correctly interpret wildcard directives such as `Disallow: /*?sort=` and `Allow:` entries, improving adherence to website robots.txt rules.
 
Complexity: 4/5
 
   help wanted  
 Async web crawler written in Python. Modular, lightweight and SQLite/MySQL ready.
  Python  
  #async #async-crawler #aysncio #beatifulsoup4 #crawler #data-mining #mysql #open-source #python #scraping #search-engine #seo-bot #sqlite #web-crawler
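
The wildcard handling the issue summary asks for could be sketched roughly as follows. This is not the project's code: `rule_to_regex` and `is_allowed` are hypothetical names, and the rule list is assumed to be already parsed into `(directive, pattern)` pairs for the relevant user agent. The precedence used here (longest matching pattern wins, `Allow` wins ties) follows the Robots Exclusion Protocol, RFC 9309; the crawler may want something different.

```python
import re


def rule_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with '.*' for every '*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path: str, rules: list[tuple[str, str]]) -> bool:
    """Return True if `path` may be fetched under the given rules.

    `rules` is a list of (directive, pattern) pairs, where directive is
    'Allow' or 'Disallow' (case-insensitive). Hypothetical input shape.
    """
    best_len = -1
    allowed = True  # no matching rule means the path is allowed
    for directive, pattern in rules:
        if not pattern:
            continue  # an empty pattern matches nothing
        # re.match anchors at the start of the path, as robots.txt requires.
        if rule_to_regex(pattern).match(path):
            is_allow = directive.lower() == "allow"
            longer = len(pattern) > best_len
            tie_for_allow = len(pattern) == best_len and is_allow
            if longer or tie_for_allow:
                best_len = len(pattern)
                allowed = is_allow
    return allowed


rules = [
    ("Disallow", "/*?sort="),
    ("Allow", "/catalog/*?sort=name$"),
]
print(is_allowed("/items?sort=asc", rules))
print(is_allowed("/catalog/shoes?sort=name", rules))
```

Pattern length is used as the measure of specificity, which is a simplification: it counts the `*` and `$` characters themselves, so two patterns of equal length are resolved in favor of `Allow`.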