Open Issues Need Help
Document Processing • Text Extraction & Conversion
AI Summary: Refactor a Python text extraction script (TextNomNom) to improve its modularity and maintainability. This involves reorganizing the code into a more structured directory layout, implementing a configuration management system, and ensuring all existing functionality remains intact after the refactoring. The proposed structure includes modules for web scraping, configuration management, file utilities (PDF, PPT, image processing), and browser automation.
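The configuration management piece of that refactor could start as small as the sketch below. This is a minimal illustration, not TextNomNom's actual code: the class `ExtractionConfig`, the function `load_config`, every field name, and the JSON file format are assumptions made for the example.

```python
import json
from dataclasses import dataclass, fields
from pathlib import Path


@dataclass(frozen=True)
class ExtractionConfig:
    """Hypothetical settings; none of these names come from TextNomNom."""
    ocr_enabled: bool = True
    ocr_language: str = "eng"
    output_dir: str = "extracted_text"
    convert_ppt_to_pdf: bool = True


def load_config(path=None):
    """Return the defaults, overridden by keys from an optional JSON file."""
    if path is None or not Path(path).is_file():
        return ExtractionConfig()
    raw = json.loads(Path(path).read_text(encoding="utf-8"))
    # Reject misspelled or unsupported keys instead of silently ignoring them.
    known = {f.name for f in fields(ExtractionConfig)}
    unknown = set(raw) - known
    if unknown:
        raise ValueError(f"Unknown config keys: {sorted(unknown)}")
    return ExtractionConfig(**raw)
```

Keeping the settings in one frozen dataclass means the scraping, file-utility, and browser modules can each receive the same read-only object rather than reading globals, which makes it easier to check that behavior is unchanged after the refactor.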
Complexity: 4/5
Labels: enhancement, help wanted, good first issue
Extract text from PDFs, PPTs, & URLs (with OCR support). Converts PPT to PDF & handles files or folders. 🦍
Python
#automated-conversion #automation #cross-platform #document-conversion #image-text-extraction #linux #pdf-processing #pdf-to-text #ppt #ppt-to-text #pptx #pptx-to-text #text-extraction #windows